I know this may be a repost, but I have not found a satisfactory answer.

I have managed to generate integers from real numbers in range 0 to n using curand_uniform, but I would like to know if there is a better way to ensure the numbers are statistically uniformly distributed.

__global__ void generate_kernel(int n, curandState *state, int *result)
{   int id = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int x;
    if (id < n)
    {    curandState localState = state[id];            

         float aux = curand_uniform(&localState) * n;
         x = aux ;

         state[id] = localState;
         result[id] = x;
    }
}

So, is there some other thing I should use instead of the integer part of curand_uniform() multiplied by n? BTW, I have n threads, each one with its own different state and seed. Each thread generates one value and saves it on the results array.

talonmies
user3117891

1 Answer

You said that each thread has its own seed, which means each thread draws from its own independent random sequence. With your kernel, uniformity is therefore only achieved per thread: each thread's own sequence is uniform, so you would have to call the kernel many times to get uniformly distributed numbers. If you intend to call the kernel only once and expect n uniformly distributed values, all n threads should use the same seed value, with different sequence or offset values. For detailed information with example code, you can look at this article.
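As a minimal sketch of the "same seed, different sequence" setup described above: curand_init takes a seed, a sequence number, an offset, and a pointer to the state. The kernel name setup_kernel and its parameters are hypothetical, chosen only for illustration.

```cuda
#include <curand_kernel.h>

// Hypothetical setup kernel: every thread uses the same seed, and the
// thread index selects a distinct subsequence of the generator.
__global__ void setup_kernel(curandState *state, unsigned long long seed, int n)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id < n) {
        // Arguments: seed, sequence number, offset, state to initialize.
        curand_init(seed, id, 0, &state[id]);
    }
}
```

This would be launched once, before the generating kernel, with at least n threads and a state array of n elements.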

There is also a subtlety in the return value of curand_uniform. The cuRAND documentation (§3.1.4) says:

__device__ float
curand_uniform (curandState_t *state)

This function returns a sequence of pseudorandom floats uniformly distributed between 0.0 and 1.0. It may return from 0.0 to 1.0, where 1.0 is included and 0.0 is excluded.

And your code is:

unsigned int x = curand_uniform(&localState) * n;

Casting to an integer type truncates toward zero. So in theory you get n only if curand_uniform returns exactly 1.0, which is rare. Writing y for the return value of curand_uniform, you get k (0 < k < n) when k/n <= y < (k+1)/n, and you get 0 when 0 < y < 1/n. So the generated numbers are not uniformly distributed over the integers 0 to n: each of 0 to n-1 has probability close to 1/n, while n has almost none.
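To make the boundary behaviour concrete, here is a small sketch of the truncating mapping as a standalone helper that can be called from host or device code. The name map_by_cast is hypothetical, and the comments assume n = 10.

```cuda
// Hypothetical helper: the same mapping as in the question,
// taking y in (0, 1] to an integer by truncation toward zero.
__host__ __device__ inline unsigned int map_by_cast(float y, unsigned int n)
{
    return (unsigned int)(y * n);
}

// With n = 10:
//   map_by_cast(0.05f, 10) == 0    since 0 < y < 1/n
//   map_by_cast(0.25f, 10) == 2    since 2/n <= y < 3/n
//   map_by_cast(1.0f,  10) == 10   the only input that yields n
```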

Note that this is all theoretical, though. I've posted some sample code that does the same integer cast as your code and builds a histogram to see how the numbers are distributed. The output of the program is at the bottom, and you'll see that the histogram looks uniformly distributed over the integers 0 to n-1. The case where curand_uniform returns 1.0 seems to be really rare (it did not appear in 100000 trials!).

Theoretically, you'll get uniformly distributed integers from 1 to n if you use the ceiling function: if (k-1)/n < y <= k/n, then ceilf(y * n) is k, for 1 <= k <= n.

unsigned int x = ceilf(curand_uniform(&localState) * n);
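If zero-based values are wanted instead, one option is to subtract one from the ceiling, which gives integers from 0 to n-1. The sketch below is an assumption-laden variant of the question's kernel, not code from the answer: map_by_ceil and generate_kernel_ceil are hypothetical names, and n is assumed small enough (well below 2^24) for y * n to be represented accurately in a float.

```cuda
#include <cmath>
#include <curand_kernel.h>

// Hypothetical helper: maps y in (0, 1] to an integer in [0, n-1].
// ceilf(y * n) lies in [1, n] because y > 0, so subtracting 1 is safe.
__host__ __device__ inline unsigned int map_by_ceil(float y, unsigned int n)
{
    return (unsigned int)ceilf(y * n) - 1u;
}

// Variant of the question's kernel using the ceiling-based mapping.
__global__ void generate_kernel_ceil(int n, curandState *state, int *result)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id < n) {
        curandState localState = state[id];
        result[id] = (int)map_by_ceil(curand_uniform(&localState), n);
        state[id] = localState;   // keep the advanced state for later launches
    }
}
```

With this mapping a return value of exactly 1.0 lands on n-1 rather than on an extra value n.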

The code I posted also covers the above case. You can run it and see a result similar to this (n = 10 in this case):

Histogram for the generated random numbers (with casting)
 0 :   9993
 1 :   9926
 2 :  10138
 3 :  10131
 4 :   9980
 5 :   9967
 6 :   9979
 7 :  10054
 8 :   9921
 9 :   9911
10 :      0
Histogram for the generated random numbers (with ceiling)
 0 :      0
 1 :  10036
 2 :  10033
 3 :   9899
 4 :   9952
 5 :   9960
 6 :   9892
 7 :  10167
 8 :   9839
 9 :  10198
10 :  10024
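For readers who want to reproduce a comparison like this, the following is a rough sketch of how such a histogram test could be written. It is not the sample code referred to above: the names setup_kernel, histogram_kernel and run_histograms, the seed 1234, and the block size of 256 are all assumptions made for illustration. Each thread draws two values, one per mapping.

```cuda
#include <cmath>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Same seed for all threads, one subsequence per thread.
__global__ void setup_kernel(curandState *state, unsigned long long seed, int total)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id < total)
        curand_init(seed, id, 0, &state[id]);
}

// Each thread draws two values and counts one under each mapping.
// Both histograms need n + 1 bins, for the integers 0..n.
__global__ void histogram_kernel(curandState *state, int total, int n,
                                 unsigned int *hist_cast, unsigned int *hist_ceil)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id < total) {
        curandState local = state[id];
        float y1 = curand_uniform(&local);
        float y2 = curand_uniform(&local);
        atomicAdd(&hist_cast[(unsigned int)(y1 * n)], 1u);       // truncation
        atomicAdd(&hist_ceil[(unsigned int)ceilf(y2 * n)], 1u);  // ceiling
        state[id] = local;
    }
}

// Fills h_cast and h_ceil (n + 1 entries each); returns false on a CUDA error.
bool run_histograms(int total, int n, unsigned int *h_cast, unsigned int *h_ceil)
{
    curandState *d_state = nullptr;
    unsigned int *d_cast = nullptr, *d_ceil = nullptr;
    size_t bytes = (n + 1) * sizeof(unsigned int);

    bool ok = cudaMalloc(&d_state, total * sizeof(curandState)) == cudaSuccess
           && cudaMalloc(&d_cast, bytes) == cudaSuccess
           && cudaMalloc(&d_ceil, bytes) == cudaSuccess
           && cudaMemset(d_cast, 0, bytes) == cudaSuccess
           && cudaMemset(d_ceil, 0, bytes) == cudaSuccess;
    if (ok) {
        int threads = 256;
        int blocks = (total + threads - 1) / threads;
        setup_kernel<<<blocks, threads>>>(d_state, 1234ULL, total);
        histogram_kernel<<<blocks, threads>>>(d_state, total, n, d_cast, d_ceil);
        // cudaMemcpy waits for the kernels and reports their errors, if any.
        ok = cudaMemcpy(h_cast, d_cast, bytes, cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaMemcpy(h_ceil, d_ceil, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_state);
    cudaFree(d_cast);
    cudaFree(d_ceil);
    return ok;
}
```

Calling run_histograms(100000, 10, h_cast, h_ceil) with two host arrays of 11 entries and printing them should give tables of the same shape as the ones above, though the individual counts depend on the seed.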

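Note also that state[id] = localState; is redundant in your code if the kernel is launched only once. It is only needed when the same states are reused in a later launch, so that each thread continues its sequence instead of repeating it.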

Community
nglee