I just finished reading @guissmo 's article about degoogling. While the article itself was quite good, the difficulties of being online these days were making me feel a little depressed. Then I...

I just finished reading @guissmo 's article about degoogling. While the article itself was quite good, the difficulties of being online these days were making me feel a little depressed.

Then I read this:

Thus, to make the saying more accurate, you don't need an infinite number of monkeys (k-tuples) to produce the works of Shakespeare—2^524288 is ample.

I'm having a hard time describing the silly joyfulness I felt when reading this. It reminded me of the good moments that would not be possible (or at least, much less likely and certainly less frequent) without the internet.

There is a saying that if you had an infinite number of monkeys typing on an infinite number of typewriters, they would produce all the great works of literature (and an inconceivable amount of dreck, too). We can cast that scenario into the world of random number generation. Suppose we have a generator that outputs 32-bit values (i.e., four bytes), and we grab its output in chunks of 16384 values at once. Each chunk will thus be 64 KB in size. If we demand that the generator be 16384-dimensionally equidistributed, we can know that all possible 64 KB sequences of data must show up eventually over the full period of the generator, which must be at least 2^(16384×32) = 2^524288. Within that immensely huge collection of outputs lies every valid 64 KB zip file, some of which will contain great literature such as Hamlet. Thus, to make the saying more accurate, you don't need an infinite number of monkeys (k-tuples) to produce the works of Shakespeare—2^524288 is ample.

...

An argument against providing k-dimensional equidistribution (for large k) is that infinitesimal probabilities aren't worth worrying about. You probably aren't going to win the lottery, and your monkeys won't write Shakespeare.

But what if rather than blindly search for Shakespeare, we manufacture a generator that is just on the cusp of producing it. With the PCG family, that's possible!

...

We similarly have a generator poised on the verge of finding Hamlet. This time it's 3,141,592,653,589,793,238,463 steps away (about 1000 times further than Much Ado About Nothing), which is still very nearby compared to the expected distance of 2^65535 steps away for an arbitrary 65536-tuple (2^65535 is an insanely huge 19,729-digit number!):

...

There are two additonal generator states provided as well—one contains Romeo and Juliet somewhere within 2^72 steps of the provided state, and the other provides Twelfth Night somewhere within 2^80 steps.

I don't think you stand any chance at all of finding them, but I might be wrong. Feel free to have a go!

What does the full period of the generator mean? I feel that this description is missing "on average" somewhere?

Any good RNG ought to be well distributed across k dimensions (i.e., there should be no pattern to the k-tuples we see), but some users would prefer a much stronger property: that over the full period of the generator, every possible k-tuple will occur, and they will all occur the same number of times.

What does the full period of the generator mean?

I feel that this description is missing "on average" somewhere?

Random number generators have finite state. They’re a finite state machine where each state has one edge going to the next state. They will loop eventually after getting back to the same state....

Random number generators have finite state. They’re a finite state machine where each state has one edge going to the next state. They will loop eventually after getting back to the same state. The idea is to make the period so large that it will never happen in real programs. The more carefully designed ones will have some kind of proof that the loop is very long no matter what seed you start with. (But for some RNG’s, you have to pick a good seed; there are some starting points that are bad and will loop quickly.)

A random number generator with k-equidistribution has to emit each possible output the same number of times before repeating, so its period has to be very long to go through all the possibilities.

I just finished reading @guissmo 's article about degoogling. While the article itself was quite good, the difficulties of being online these days were making me feel a little depressed.

Then I read this:

I'm having a hard time describing the silly joyfulness I felt when reading this. It reminded me of the good moments that would not be possible (or at least, much less likely and certainly less frequent) without the internet.

You had me excited about new PCG tricks, but this is old news! That said, more folks should read the PCG papers - they’re very thorough.

From the article:

...

...

...

What does

the full period of the generatormean?I feel that this description is missing "on average" somewhere?

Random number generators have finite state. They’re a finite state machine where each state has one edge going to the next state. They will loop eventually after getting back to the same state. The idea is to make the period so large that it will never happen in real programs. The more carefully designed ones will have some kind of proof that the loop is very long no matter what seed you start with. (But for some RNG’s, you have to pick a good seed; there are some starting points that are bad and will loop quickly.)

A random number generator with k-equidistribution has to emit each possible output the same number of times before repeating, so its period has to be very long to go through all the possibilities.