3

I need to do a linear congruential generator that will successfully pass the selected statistical tests.

My question is: how to choose numbers for the generator properly and which statistical tests should I choose?

I thought about:

  1. Chi-Square Frequency Test for Uniformity

    • Collect 10,000 numbers per generation method

    • Sub-divide[0.1) into 10 equal subdivisions

  2. Kolmogorov-Smirnov Test for uniformity

    • Since K-S Test works better with a smaller set of numbers, you may use the first 100 out fo the 10,000 that you generated for the Chi-Square Frequency Test

Here is the code example:

def seedLCG(initVal):
    global rand
    rand = initVal

def lcg():
    a = 1664525
    c = 1013904223
    m = 2**32
    global rand
    rand = (a*rand + c) % m
    return rand

seedLCG(1)

for i in range(1000):
    print (lcg())

when it comes to choosing seeds, I was thinking about nanoseconds, but I have no idea how to implement it and will it make sense at all? The idea is to show that the selected seeds were chosen randomly and not so much from the cap

tbone
  • 1,148
  • 5
  • 19
  • 35
  • 1
    This sounds like a homework, in which case the choices are *yours*, not ours. Wikipedia has a [page](https://en.wikipedia.org/wiki/Linear_congruential_generator) with rules for picking coefficients, and a table of commonly used ones. For tests, the two you mention are fairly standard. See the [Diehard tests](https://en.wikipedia.org/wiki/Diehard_tests) if you need more alternatives. For seeding, `time.time_ns()` is available in Python 3.7. Finally, check out https://www.sciencedirect.com/science/article/pii/0167637786900921 for a test which fails LCGs that return their full seed. – pjs Jul 11 '19 at 13:50
  • Do you think that the choice of initial parameters using **time.time_ns ()** is a good solution or it is better to choose something from standard (from the table)? – tbone Jul 11 '19 at 14:33
  • Great article. Thank you! – tbone Jul 11 '19 at 15:42
  • `time.time_ns()` would be used for seeding, not for choosing LCG parameterization. That's [how Java does it (see lines 114-135)](http://developer.classpath.org/doc/java/util/Random-source.html). – pjs Jul 11 '19 at 16:16
  • Which article struck your fancy? I had several links there... – pjs Jul 11 '19 at 16:16
  • sciencedirect.com/science/article/pii/0167637786900921 - it helped me understand it better – tbone Jul 11 '19 at 16:23
  • @pjs do you have any information regarding the implementation of diehard tests in python? – tbone Jul 16 '19 at 17:57
  • They're a suite of tests which are implemented in C and FORTRAN for speed, available [here](https://web.archive.org/web/20160125103112/http://stat.fsu.edu/pub/diehard/). The idea was that they should be independent of the language used to generate the data being tested. You would have to port them to Python unless you can find somebody who has done so already. – pjs Jul 16 '19 at 19:16

1 Answers1

3

Wrt how to choose numbers for the generator properly, in Wiki page there is a description of Hull–Dobell Theorem which tells you how to pick a and c to have full period generator. You got your numbers from Numerical Recipes, and as far as I could tell you'll get full period [0...232) generator. Or you could look at Figure of Merit from this paper, there are (a,c) pairs for any size of period desirable.

Concerning tests, take a look at paper @pjs provided.

when it comes to choosing seeds, I was thinking about nanoseconds, but I have no idea how to implement it and will it make sense at all? The idea is to show that the selected seeds were chosen randomly and not so much from the cap. This is, I think, is not a good idea because you cannot guarantee seeds you picked from time/ceil/... won't overlap. LCG is basically bijective [0...232)<->[0...232) mapping, and it is relatively easy to overlap streams of random numbers so your result(s) are correlated.

Instead I would propose to use another nice property of the LCG - logarithmic skip ahead (and back). So for simulating on N cores you could just pick single seed and run on 1st code, same seed but skip(N/232) for second core, seed and skip(N/232 * 2) and so on and so forth.

Code for LCG with explicit state and skip is below, Win10 x64, Python 3.7 Anaconda

import numpy as np

class LCG(object):

    UZERO: np.uint32 = np.uint32(0)
    UONE : np.uint32 = np.uint32(1)

    def __init__(self, seed: np.uint32, a: np.uint32, c: np.uint32) -> None:
        self._seed: np.uint32 = np.uint32(seed)
        self._a   : np.uint32 = np.uint32(a)
        self._c   : np.uint32 = np.uint32(c)

    def next(self) -> np.uint32:
        self._seed = self._a * self._seed + self._c
        return self._seed

    def seed(self) -> np.uint32:
        return self._seed

    def set_seed(self, seed: np.uint32) -> np.uint32:
        self._seed = seed

    def skip(self, ns: np.int32) -> None:
        """
        Signed argument - skip forward as well as backward

        The algorithm here to determine the parameters used to skip ahead is
        described in the paper F. Brown, "Random Number Generation with Arbitrary Stride,"
        Trans. Am. Nucl. Soc. (Nov. 1994). This algorithm is able to skip ahead in
        O(log2(N)) operations instead of O(N). It computes parameters
        A and C which can then be used to find x_N = A*x_0 + C mod 2^M.
        """

        nskip: np.uint32 = np.uint32(ns)

        a: np.uint32 = self._a
        c: np.uint32 = self._c

        a_next: np.uint32 = LCG.UONE
        c_next: np.uint32 = LCG.UZERO

        while nskip > LCG.UZERO:
            if (nskip & LCG.UONE) != LCG.UZERO:
                a_next = a_next * a
                c_next = c_next * a + c

            c = (a + LCG.UONE) * c
            a = a * a

            nskip = nskip >> LCG.UONE

        self._seed = a_next * self._seed + c_next


#%%
np.seterr(over='ignore')

a = np.uint32(1664525)
c = np.uint32(1013904223)
seed = np.uint32(1)

rng = LCG(seed, a, c)

print(rng.next())
print(rng.next())
print(rng.next())

rng.skip(-3) # back by 3
print(rng.next())
print(rng.next())
print(rng.next())

rng.skip(-3) # back by 3
rng.skip(2) # forward by 2
print(rng.next())

UPDATE

Generating 10k numbers

np.seterr(over='ignore')

a = np.uint32(1664525)
c = np.uint32(1013904223)
seed = np.uint32(1)

rng = LCG(seed, a, c)
q = [rng.next() for _ in range(0, 10000)]
tbone
  • 1,148
  • 5
  • 19
  • 35
Severin Pappadeux
  • 18,636
  • 3
  • 38
  • 64
  • thank you it sounds good. Although the code is complicated for me but I will try to understand it and I don't quite understand how to generate e.g 10k numbers for it – tbone Jul 15 '19 at 16:06
  • @tbone Please check update how to generate 10k numbers with it. – Severin Pappadeux Jul 15 '19 at 16:43
  • Should not it be rng.next instead of LCG.next ?? – tbone Jul 15 '19 at 17:55
  • @tbone where is LCG.next? Failed to see it in my code – Severin Pappadeux Jul 15 '19 at 18:44
  • Did not we have to call this method on the object? (rng.next () instead of LCG.next) with LCG.next () I got this error: next () missing 1 required positional argument: 'self'. I got such an error with the original code but when it comes to editing, I can be wrong – tbone Jul 15 '19 at 18:47
  • @tbone LCG is a class, and next() is not a class (aka static) method. If you want to use LCG as iterator/generator, you have to make it explicit, see https://stackoverflow.com/questions/42983569/how-to-write-a-generator-class for details. As written, you have to instantiate object of LCG class and explicitly call .next() method for that object. – Severin Pappadeux Jul 15 '19 at 19:20
  • I'm trying to create an instance but it throws me errors. what happens when it is rng.next ()? – tbone Jul 15 '19 at 20:23
  • Are you sure there should be LCG.next ()? With rng.next () it looks like it works. with LCG.next () I try as I can and it does not work – tbone Jul 15 '19 at 20:37
  • It won't work with LCG.next(), it is not a class method. So you could apply call .next() to the instance, not to he class. So rng.next() is the right thing to do – Severin Pappadeux Jul 15 '19 at 20:43