1

I'm looking for a mixing function that given an integer from an interval <0, n) returns a random-looking integer from the same interval. The interval size n will typically be a composite non power of 2 number. I need the function to be one to one. It can only use O(1) memory, O(1) time is strongly preferred. I'm not too concerned about randomness of the output, but visually it should look random enough (see next paragraph).

I want to use this function as a pixel shuffling step in a realtime-ish renderer to select the order in which pixels are rendered (The output will be displayed after a fixed time and if it's not done yet this gives me a noisy but fast partial preview). Interval size n will be the number of pixels in the render (n = 1920*1080 = 2073600 would be a typical value). The function must be one to one so that I can be sure that every pixel is rendered exactly once when finished.

I've looked at the reversible building blocks used by hash prospector, but these are mostly specific to power of 2 ranges.

The only other method I could think of is multiply by large prime, but it doesn't give particularly nice random looking outputs.

What are some other options here?

cube
  • 3,867
  • 7
  • 32
  • 52
  • `f(x) = (p*x+b) (mod n)` where `p` is a large prime with `gcd(p,n) = 1`? The `b` will make it seem more random. – John Coleman Mar 22 '19 at 13:29
  • I've tried it and it doesn't help much :/ – cube Mar 22 '19 at 13:36
  • How about using a generator which has a power of 2, using the smallest power of 2 which is `> n`? Throw away generated indices which exceed `n`. On average, you will throw away less than 1 out of every 2 outputs. If the generator is fast, it will still be fast if used this way. – John Coleman Mar 22 '19 at 13:39
  • Would that mapping remain one to one? – cube Mar 26 '19 at 13:11
  • Yes: For example, given a permutation `0, 4, 7, 1, 5, 3, 2, 6` of `0-7`, if you throw away values which exceed 5 you get a permutation `0, 4, 1, 5, 3, 2` of `0-5`. Think of this as a filter. A quasi-random way of going through 0-7 yields a quasi-random way for going through 0-5. There is some inefficiency going this route, but if the generator with a cycle which is a power of 2 is both fast and has a good distribution, then the performance should be acceptable. – John Coleman Mar 26 '19 at 14:08
  • I see, that would definitely work, but I can't really throw away any values because of how the GPU works. Each invocation of the function should be completely independent of all the others. – cube Mar 26 '19 at 15:50
  • So -- you want a computation that (unlike a typical random number generator) doesn't retain any state between calls. – John Coleman Mar 26 '19 at 15:59
  • But -- I don't see what you mean by "I can't really throw away values". Start the computation in state `i` and then use a while-loop until you get a value in the requisite range. – John Coleman Mar 26 '19 at 16:11
  • What I mean is that I'm going to calculate all values of the function (potentially) in parallel. Schedule n threads, each of them calculates `f(threadID)`. I can't control what order this happens in and I can't easily communicate between the individual threads. – cube Mar 27 '19 at 06:52
  • related questions: https://stackoverflow.com/questions/3910101/how-to-generate-a-predictable-shuffling-of-a-sequence-without-generating-the-who https://stackoverflow.com/questions/16416190/how-do-i-make-a-permutation-function-that-accepts-one-value-at-a-time – cube Mar 27 '19 at 13:10

1 Answers1

1

Here is one solution based on the idea of primitive roots modulo a prime:

If a is a primitive root mod p then the function g(i) = a^i % p is a permutation of the nonzero elements which are less than p. This corresponds to the Lehmer prng. If n < p, you can get a permutation of 0, ..., n-1 as follows: Given i in that range, first add 1, then repeatedly multiply by a, taking the result mod p, until you get an element which is <= n, at which point you return the result - 1.

To fill in the details, this paper contains a table which gives a series of primes (all of which are close to various powers of 2) and corresponding primitive roots which are chosen so that they yield a generator with good statistical properties. Here is a part of that table, encoded as a Python dictionary in which the keys are the primes and the primitive roots are the values:

d = {32749: 30805,
     65521: 32236,
     131071: 66284,
     262139: 166972,
     524287: 358899,
     1048573: 444362,
     2097143: 1372180,
     4194301: 1406151,
     8388593: 5169235,
     16777213: 9726917,
     33554393: 32544832,
     67108859: 11526618,
     134217689: 70391260,
     268435399: 150873839,
     536870909: 219118189,
     1073741789: 599290962}

Given n (in a certain range -- see the paper if you need to expand that range), you can find the smallest p which works:

def find_p_a(n):
    for p in sorted(d.keys()):
        if n < p:
            return p, d[p]

once you know n and the matching p,a the following function is a permutation of 0 ... n-1:

def f(i,n,p,a):
    x = a*(i+1) % p
    while x > n:
        x = a*x % p
    return x-1

For a quick test:

n = 2073600
p,a = find_p_a(n) # p = 2097143, a = 1372180
nums = [f(i,n,p,a) for i in range(n)]
print(len(set(nums)) == n) #prints True

The average number of multiplications in f() is p/n, which in this case is 1.011 and will never be more than 2 (or very slightly larger since the p are not exact powers of 2). In practice this method is not fundamentally different from your "multiply by a large prime" approach, but in this case the factor is chosen more carefully, and the fact that sometimes more than 1 multiplication is required adding to the apparent randomness.

John Coleman
  • 51,337
  • 7
  • 54
  • 119
  • Thank you for the answer, looks promissing. Isn't there an error in the definition of f? You are multiplying a by something, but probably should be doing an exponentiation. – cube Mar 27 '19 at 14:33
  • There is no error in `f()`. Exponentiation is just repeated multiplication. The idea is that `i+1` is equal to `a^k % p` for some `k` -- but you really don't need to know what that `k` is. `a*(i+1) % k` will be the same as `a^(k+1) % p`. Doing it this way is the key to doing it in a way where each computation can be done in parallel. – John Coleman Mar 27 '19 at 14:42
  • Wow. Works perfectly, Now I just need to wrap my head around it :-) – cube Mar 27 '19 at 14:50