# Difference between revisions of "Random number generator"

(modify C example, add some text) |
(massive update. ***I REQUEST PROOFREADING BY A NATIVE ENGLISH SPEAKER***) |
||

Line 3: | Line 3: | ||

Many players refer to the RNG as the '''Random Number God''', the entity inside the game that provides random functionality. This diety seems to leave great [[item]]s and easy [[monster]]s for some players, while granting seemingly-unfair deaths to others. Some players superstitiously avoid offending the Random Number God. | Many players refer to the RNG as the '''Random Number God''', the entity inside the game that provides random functionality. This diety seems to leave great [[item]]s and easy [[monster]]s for some players, while granting seemingly-unfair deaths to others. Some players superstitiously avoid offending the Random Number God. | ||

= Algorithms = | |||

All RNG algorithms start with an initial ''seed''. The algorithm uses the seed to compute its initial ''state''. To compute the next random number, the algorithm applies some formula to calculate the next state from the current state, then it uses the state to produce a number from a certain range. | All RNG algorithms start with an initial ''seed''. The algorithm uses the seed to compute its initial ''state''. To compute the next random number, the algorithm applies some formula to calculate the next state from the current state, then it uses the state to produce a number from a certain range. | ||

Line 12: | Line 12: | ||

If an eavesdropper (who does not know the seed) can use a sequence of outputs from the RNG to predict the next random number, then the RNG is not cryptographically secure. Conveniently, cryptographic security is not important for roguelikes. | If an eavesdropper (who does not know the seed) can use a sequence of outputs from the RNG to predict the next random number, then the RNG is not cryptographically secure. Conveniently, cryptographic security is not important for roguelikes. | ||

== Seeding == | |||

Some RNGs seed themselves differently each time, but other RNGs use the same default seed every time unless you remember to seed them. If every game uses the same seed, then it will roll the same character and (if the opening map is random) start at the same map every time! You can observe this if you intentionally set a particular seed during testing, or if the game has a bug that prevents it from setting the seed. For example, [http://sourceforge.net/tracker/index.php?func=detail&aid=1380333&group_id=9746&atid=109746 SLASH'EM bug 1380333] was the result of a game, in some configurations, seeding one random number generator but using another. | Some RNGs seed themselves differently each time, but other RNGs use the same default seed every time unless you remember to seed them. If every game uses the same seed, then it will roll the same character and (if the opening map is random) start at the same map every time! You can observe this if you intentionally set a particular seed during testing, or if the game has a bug that prevents it from setting the seed. For example, [http://sourceforge.net/tracker/index.php?func=detail&aid=1380333&group_id=9746&atid=109746 SLASH'EM bug 1380333] was the result of a game, in some configurations, seeding one random number generator but using another. | ||

A simple way to grow a different seed every game is to use the current time as a seed. This traditional method is barely enough to give the player a different experience every time. | A simple way to grow a different seed every game is to use the current time as a seed. This traditional method is barely enough to give the player a different experience every time. | ||

== RNG in programming languages | == Desired features == | ||

Pseudorandom number generators, in order to be considered "good", must offer the following: | |||

=== Speed === | |||

PRNG algorithms have different speeds. A Linear Congruential Generator (X<sub>n+1</sub> = (<i>a</i>X<sub>n</sub>+<i>b</i>) mod <i>m</i>) is probably the fastest one available. Other algorithms will offer lower speeds due to a greater number of calculations required to generate a number. | |||

=== Uniformity === | |||

Regrettably, not all algorithms offer a uniform distribution. For instance, if we had a random number generator returning two-bit numbers (range 0-3), using it as a base for creating numbers from 0 to 2 (<code>X=RNG4()%3;</code>) would make the value of 0 appear twice as often as any other. Also, congruential generators have a non uniform distribution. | |||

=== Periodicity === | |||

A PRNG's period is the number of random values needed to be generated in order to start repeating the same number progression again. Fortunately, a short period of 2<sup>32</sup> is usually enough for most uses. | |||

= RNG in programming languages = | |||

Most programming environments provide an RNG. Typically, a game will: | Most programming environments provide an RNG. Typically, a game will: | ||

# seed the RNG, if necessary | # seed the RNG, if necessary | ||

Line 24: | Line 40: | ||

#* one can convert the RNG's output to the correct range | #* one can convert the RNG's output to the correct range | ||

== C and C++ == | |||

In [[C]] and [[Cpp|C++]], the random number generator is the <tt>rand()</tt> function. One sets the seed with <tt>srand()</tt>. Some call these the "ANSI C" (or equivalently "ISO C") functions because they are part of the C standard and to distinguish them from other RNGs that some systems provide. | In [[C]] and [[Cpp|C++]], the random number generator is the <tt>rand()</tt> function. One sets the seed with <tt>srand()</tt>. Some call these the "ANSI C" (or equivalently "ISO C") functions because they are part of the C standard and to distinguish them from other RNGs that some systems provide. | ||

Line 51: | Line 67: | ||

If in doubt, you should implement a new RNG. However, you should '''not''' try to create your own algorithm, as it would almost certainly be even worse than the one you're trying to replace! Many people think the [[Mersenne twister]] is best, and there are existing implementations of it. | If in doubt, you should implement a new RNG. However, you should '''not''' try to create your own algorithm, as it would almost certainly be even worse than the one you're trying to replace! Many people think the [[Mersenne twister]] is best, and there are existing implementations of it. | ||

== Java == | |||

On the other hand, in [[Java]] the commonly used RNG resides in the Random class, which uppon initialized with a seed or the system time, returns pseudo-aleatory values of different primitive types. For example, you may invoke the method nextDouble() to get a value from 0.0D to 1.0D, which you may then scale to any range you need. | On the other hand, in [[Java]] the commonly used RNG resides in the Random class, which uppon initialized with a seed or the system time, returns pseudo-aleatory values of different primitive types. For example, you may invoke the method nextDouble() to get a value from 0.0D to 1.0D, which you may then scale to any range you need. | ||

Line 58: | Line 74: | ||

Most RL projects encapsulate or wrap the RNG functionality to allow exchangability of algorithms | Most RL projects encapsulate or wrap the RNG functionality to allow exchangability of algorithms | ||

== RNG algorithms == | = Criticisms = | ||

Pseudorandom number generators, or PRNGs, struggle against difficulties in generating numbers in as few CPU ticks as possible, while retaining a decent periodicity and quality. George Marsaglia created an application used for testing the quality of PRNG algorithms, called [http://www.cs.hku.hk/~diehard Diehard]. He then brought up criticisms against many commonly used algorithms, mainly the Multiplicative Congruential Generator (used in most rand() function implementations). His comments can be found, for instance, on comp.lang.c Usenet group. | |||

As Marsaglia has [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC285899/pdf/pnas00123-0038.pdf proven], MCGs create non-uniform pseudorandom numbers, falling into parallel hyperplanes. One of the examples of this undesired behaviour is the infamous [http://en.wikipedia.org/wiki/RANDU RANDU]. Additionally, most PRNGs of this type are predictable, as it is enough to observe 2 or 3 subsequently generated numbers to discover all the others. Also, this algorithm group has an undesired tendency to offer extremely short periods for lower order bits of the generated numbers (for instance, the last bit often has a period of 2<sup>0</sup>: it simply alternates between 0 and 1). | |||

The [[Mersenne twister]] was also criticised for being difficult to implement, even though it generates fast and high quality numbers. It also stores the last 624 generated numbers. Knowing this sequence can reveal all future iterates. This makes MT unsuitable for cryptography, although such a criticism does not apply in case of roguelike games. | |||

A PRNG's periodicity is often te subject of criticisms as well. While congruential generators have a periodicity at most equal to 2<sup>32</sup> on most 32 bit platforms (althought there ar examples: Java implements such a generator with a period of 2<sup>48</sup>, and Donald Knuth's MMIX implementation has a period of <sup>64</sup>), they often offer a ridiculously low <code>RAND_MAX</code> value, shortening the period to 2<sup>15</sup> (for instance, MSVC). Such a low periodicity is unacceptable in many cases. | |||

= RNG algorithms = | |||

Several RNG algorithms have been devised. Here is a brief list: | |||

== Mersenne Twister == | |||

A commonly used PRNG is the [[Mersenne twister]], a.k.a. MT19937, a recent, fast and algorithm of high quality. See [http://www.math.keio.ac.jp/~matumoto/emt.html] for more info about it. It has proven to pass George Marsaglia's Diehard PRNG tests and is commonly recommended as an excellent choice. It has a long period of exactly 2<sup>19937</sup>-1, from which the algorithm derives its name (2<sup>19937</sup>-1 is a Mersenne prime). | |||

== Linear Congruential Generator == | |||

Avoided by many due to its deficiencies, it can still be useful for some operation where neither periodicity nor uniform distribution aren't required. | |||

== Multiply-With-Carry == | |||

Excellent PRNG due to its simplicity, superior speed, high quality random numbers and long periods (a MWC256, storing the last 256 iterates, has a period of 2<sup>8222</sup>). | |||

== Complementary Multiply-With-Carry == | |||

Suggested by George Marsaglia, this PRNG is very fast, simple, generates very high quality numbers and has an extreme period. The CMWC4096, described [http://school.anhb.uwa.edu.au/personalpages/kwessen/shared/Marsaglia03.html here], stores the last 4096 iterates and has a near-record period of 2<sup>131104</sup>. In [[libtcod]], this algorithm has replaced the MT19937 as the default PRNG. | |||

== Generalised Feedback Shift Register == | |||

Also known as GFSR, it is a very fast generator with good randomness and relatively high periods. The basic concept is the following: | |||

rn<sub>n</sub> = rn<sub>n-A</sub> XOR rn<sub>n-B</sub> XOR rn<sub>n-C</sub> XOR rn<sub>n-D</sub> | |||

In this particular case (K=4), the period is ~2<sup>1000</sup>. | |||

[[Category:Articles]] | [[Category:Articles]] |

## Revision as of 15:50, 30 January 2010

A **random number generator**, or **RNG** for short, is a special algorithm that returns a pseudo-random number, the next number of an infinite sequence deduced from a "seed" or initial number. Roguelike games use the RNG to compute dice rolls and in other cases that require random generation. Some RNG algorithms evaluate simple polynomials, while others use techniques derived from the fractal and chaos theories.

Many players refer to the RNG as the **Random Number God**, the entity inside the game that provides random functionality. This diety seems to leave great items and easy monsters for some players, while granting seemingly-unfair deaths to others. Some players superstitiously avoid offending the Random Number God.

# Algorithms

All RNG algorithms start with an initial *seed*. The algorithm uses the seed to compute its initial *state*. To compute the next random number, the algorithm applies some formula to calculate the next state from the current state, then it uses the state to produce a number from a certain range.

Some RNGs produce integers, while others produce floating-point. Most roguelikes want integers; conversion from floating-point is often easy as multiply and floor.

After very many uses, any RNG algorithm will *cycle* as it reaches a state equal to its initial state. After this, the RNG will begin to compute the exact sequence of random numbers again! A good RNG would have a very long *period* so that it practically never cycles.

If an eavesdropper (who does not know the seed) can use a sequence of outputs from the RNG to predict the next random number, then the RNG is not cryptographically secure. Conveniently, cryptographic security is not important for roguelikes.

## Seeding

Some RNGs seed themselves differently each time, but other RNGs use the same default seed every time unless you remember to seed them. If every game uses the same seed, then it will roll the same character and (if the opening map is random) start at the same map every time! You can observe this if you intentionally set a particular seed during testing, or if the game has a bug that prevents it from setting the seed. For example, SLASH'EM bug 1380333 was the result of a game, in some configurations, seeding one random number generator but using another.

A simple way to grow a different seed every game is to use the current time as a seed. This traditional method is barely enough to give the player a different experience every time.

## Desired features

Pseudorandom number generators, in order to be considered "good", must offer the following:

### Speed

PRNG algorithms have different speeds. A Linear Congruential Generator (X_{n+1} = (*a*X_{n}+*b*) mod *m*) is probably the fastest one available. Other algorithms will offer lower speeds due to a greater number of calculations required to generate a number.

### Uniformity

Regrettably, not all algorithms offer a uniform distribution. For instance, if we had a random number generator returning two-bit numbers (range 0-3), using it as a base for creating numbers from 0 to 2 (`X=RNG4()%3;`

) would make the value of 0 appear twice as often as any other. Also, congruential generators have a non uniform distribution.

### Periodicity

A PRNG's period is the number of random values needed to be generated in order to start repeating the same number progression again. Fortunately, a short period of 2^{32} is usually enough for most uses.

# RNG in programming languages

Most programming environments provide an RNG. Typically, a game will:

- seed the RNG, if necessary
- provide its own functions to access the RNG, so that
- one can change to a different RNG later
- one can convert the RNG's output to the correct range

## C and C++

In C and C++, the random number generator is the `rand()` function. One sets the seed with `srand()`. Some call these the "ANSI C" (or equivalently "ISO C") functions because they are part of the C standard and to distinguish them from other RNGs that some systems provide.

The rand() function returns an int in the range from 0 to RAND_MAX, a platform-dependent value. Here is a simple program to show one random value:

#include <stdlib.h> /* rand, srand */ #include <stdio.h> /* printf (for this example) */ #include <time.h> /* time */ int main() { srand( time(NULL) ); printf("%d\n", rand()); return 0; }

Many programmers like to seed srand() with the return value of `time()`, as we do above. In fact, if we run this program multiple times within one second, it may print the same number again. (Some will write "srand( time(0) )", which is considered bad by some because the time function takes a pointer. Using NULL reminds you that it is a pointer. However, using 0 does the same thing, because NULL is almost always #defined to be 0. (See http://www.lysator.liu.se/c/c-faq/c-1.html#1-3.) If you want, you can add a cast to unsigned int. Adding it is necessary to satisfy lint.)

Now we come to an important note: **Many platforms have poor-quality versions of the rand() function.** Above GNU platforms, rand() and srand() work relatively well, and *The GNU C Library Reference Manual* recommends their use. If your favorite platform is GNU or Linux, then you could program with rand() and srand() and have your game at least working above other C platforms. But the most popular roguelike games support many platforms well, and their developers avoid rand() when they can.

BSD platforms implement rand() and srand() in terms of `rand_r()`, which limits the state of the RNG to 32 bits (implying a period of 2**31-1 or less). BSD calls this a "bad random number generator", while GNU states that 32 bits is "far too few to provide a good RNG."

Meanwhile, though RAND_MAX is platform dependent, BSD and GNU use 2,147,483,647, the largest possible signed int. One should be warned that on MSVC, RAND_MAX is 32,767, which may be a lot smaller than you expect.

If in doubt, you should implement a new RNG. However, you should **not** try to create your own algorithm, as it would almost certainly be even worse than the one you're trying to replace! Many people think the Mersenne twister is best, and there are existing implementations of it.

## Java

On the other hand, in Java the commonly used RNG resides in the Random class, which uppon initialized with a seed or the system time, returns pseudo-aleatory values of different primitive types. For example, you may invoke the method nextDouble() to get a value from 0.0D to 1.0D, which you may then scale to any range you need.

You may also use the rand() method of the "Math" utility class; however it is recommended to have your seedable RNG as a method in your own utility class so that the implementation of it may be changed without pain.

Most RL projects encapsulate or wrap the RNG functionality to allow exchangability of algorithms

# Criticisms

Pseudorandom number generators, or PRNGs, struggle against difficulties in generating numbers in as few CPU ticks as possible, while retaining a decent periodicity and quality. George Marsaglia created an application used for testing the quality of PRNG algorithms, called Diehard. He then brought up criticisms against many commonly used algorithms, mainly the Multiplicative Congruential Generator (used in most rand() function implementations). His comments can be found, for instance, on comp.lang.c Usenet group.

As Marsaglia has proven, MCGs create non-uniform pseudorandom numbers, falling into parallel hyperplanes. One of the examples of this undesired behaviour is the infamous RANDU. Additionally, most PRNGs of this type are predictable, as it is enough to observe 2 or 3 subsequently generated numbers to discover all the others. Also, this algorithm group has an undesired tendency to offer extremely short periods for lower order bits of the generated numbers (for instance, the last bit often has a period of 2^{0}: it simply alternates between 0 and 1).

The Mersenne twister was also criticised for being difficult to implement, even though it generates fast and high quality numbers. It also stores the last 624 generated numbers. Knowing this sequence can reveal all future iterates. This makes MT unsuitable for cryptography, although such a criticism does not apply in case of roguelike games.

A PRNG's periodicity is often te subject of criticisms as well. While congruential generators have a periodicity at most equal to 2^{32} on most 32 bit platforms (althought there ar examples: Java implements such a generator with a period of 2^{48}, and Donald Knuth's MMIX implementation has a period of ^{64}), they often offer a ridiculously low `RAND_MAX`

value, shortening the period to 2^{15} (for instance, MSVC). Such a low periodicity is unacceptable in many cases.

# RNG algorithms

Several RNG algorithms have been devised. Here is a brief list:

## Mersenne Twister

A commonly used PRNG is the Mersenne twister, a.k.a. MT19937, a recent, fast and algorithm of high quality. See [1] for more info about it. It has proven to pass George Marsaglia's Diehard PRNG tests and is commonly recommended as an excellent choice. It has a long period of exactly 2^{19937}-1, from which the algorithm derives its name (2^{19937}-1 is a Mersenne prime).

## Linear Congruential Generator

Avoided by many due to its deficiencies, it can still be useful for some operation where neither periodicity nor uniform distribution aren't required.

## Multiply-With-Carry

Excellent PRNG due to its simplicity, superior speed, high quality random numbers and long periods (a MWC256, storing the last 256 iterates, has a period of 2^{8222}).

## Complementary Multiply-With-Carry

Suggested by George Marsaglia, this PRNG is very fast, simple, generates very high quality numbers and has an extreme period. The CMWC4096, described here, stores the last 4096 iterates and has a near-record period of 2^{131104}. In libtcod, this algorithm has replaced the MT19937 as the default PRNG.

## Generalised Feedback Shift Register

Also known as GFSR, it is a very fast generator with good randomness and relatively high periods. The basic concept is the following:

rn_{n}= rn_{n-A}XOR rn_{n-B}XOR rn_{n-C}XOR rn_{n-D}

In this particular case (K=4), the period is ~2^{1000}.