I am trying to understand the correct usage of parallel random number generation. After having consulted different resources, I wrote a simple code that seems to work, but it would be nice if someone could confirm my understanding.
To highlight the difference and the relationship between rand() and rand_r(), let's solve the following problem:
Produce a random integer N, then extract N random numbers in parallel and compute their average.
This is my proposal (error checking and free() omitted; small integers on purpose):
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <omp.h>

int main(void) {
    /* Initialize and extract an integer via rand() */
    srand(time(NULL));
    int N = rand() % 100 + 1; /* 1..100, so N is never 0 */

    /* Storage array */
    int *extracted = malloc(sizeof(int) * N);

    /* Initialize N seeds for rand_r(), which is completely
     * independent of rand() and srand().
     * (QUESTION 1: is that right?)
     * Is setting the first seed to time(NULL) and the others
     * by successive increments a good idea? (QUESTION 2) */
    unsigned int *my_seeds = malloc(sizeof(unsigned int) * N);
    my_seeds[0] = time(NULL);
    for (int i = 1; i < N; ++i) {
        my_seeds[i] = my_seeds[i - 1] + 1;
    }

    /* The seeds for rand_r() are ready:
     * extract N random numbers in parallel */
    #pragma omp parallel for
    for (int i = 0; i < N; ++i) {
        extracted[i] = rand_r(my_seeds + i) % 10;
    }

    /* Compute the average: must this be done sequentially (QUESTION 3)
     * because of synchronization when reading/writing avg? */
    double avg = 0;
    for (int i = 0; i < N; ++i) {
        avg += extracted[i];
    }
    avg /= N;

    printf("%d samples, %.2f on average.\n", N, avg);
    return 0;
}
As my comments in the code try to highlight, it would be helpful to understand whether:

1. the simultaneous use of rand() and rand_r() is correct in this case;
2. the seed initialization for rand_r(), i.e. the array my_seeds, is fine;
3. the parallelization of the for loop and the related variable usage are safe.
I hope this sums up several common doubts in a single, simple, ready-to-use example; I wrote it after reading various tutorials and sources online (this website included).