0

I need to generate long (pseudo)random arrays (1000-25 000 000 integers) where no element is repeated. How do I do it since rand() function does not generate numbers long enough?

I tried to use this idea: array[i] = (rand() << 14) | rand() % length; however I suppose there is much better way that I don't know.

Thank you for your help.

Allemis
  • 47
  • 1
  • 10

1 Answers1

0

You can use the Fisher-Yates shuffle for this.

Create an array of n elements and populate each element sequentially.

-------------------------
| 1 | 2 | 3 | 4 | 5 | 6 |
-------------------------

In this example n is 6. Now select a random index from 0 to n-1 (i.e. rand() % n) and swap the number at that index with the number at the top of the array. Let's say the random index is 2. So we swap the value at index 2 (3) and the one at n-1 (6). Now we have:

                      v
-------------------------
| 1 | 2 | 6 | 4 | 5 | 3 |
-------------------------

Now we do the same, this time with the upper bound of the index being n-2. Then we swap the value at that index with the value at index n-2. Let's say time we randomly get 0. So we swap index 0 (1) with index n-2 (5):

                  v
-------------------------
| 5 | 2 | 6 | 4 | 1 | 3 |
-------------------------

Then repeat. Let's say the next random index is 3. This happens to be our upper limit, so no change:

              v
-------------------------
| 5 | 2 | 6 | 4 | 1 | 3 |
-------------------------

Next we get 0:

          v
-------------------------
| 6 | 2 | 5 | 4 | 1 | 3 |
-------------------------

And finally 1:

      v
-------------------------
| 6 | 2 | 5 | 4 | 1 | 3 |
-------------------------
dbush
  • 205,898
  • 23
  • 218
  • 273
  • Thank you very much. Then I suppose I should generate an array containing values from 1 up to n (n is from 1000 up to 25 000 000) and shuffle it. The problem I still see is to generate a random number up to 25 000 000 since `RAND_MAX` on my system is 32767. – Allemis Apr 21 '17 at 16:37
  • @Allemis In that case you'll need to call `rand()` twice and shift/OR, i.e. `((rand() << 15) | rand()) % length` – dbush Apr 21 '17 at 16:39