
For a package I am writing, I need unique numbers between 0 and 2**33 in random order. Initially, I tried to use a generator as follows:

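import random

def randomnumber(NUM):
    # Build the full list in memory, shuffle it in place,
    # then yield each value exactly once.
    numbers = list(range(NUM))
    random.shuffle(numbers)  # lists have no .shuffle() method
    yield from numbers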

But since NUM is 2**33 in my case, this approach is not feasible. I wrote all the numbers to a text file from bash and found that the file is 93.6 GB, which is far more than my RAM. I am now shuffling the contents of the file with terashuf and reading each line from it with linecache.

I am also using the multiprocessing module (apply_async in particular) and need to pass this generator object as an argument, but Python raised an error saying it can't use a generator object in pool processes. I went through a few questions on SO, and one answer suggests building a list of a few numbers from the generator and passing those as arguments to the function running in parallel, but that didn't work either.

So my question is: is there a way to create a generator that does the intended work (yielding unique random numbers between 0 and 2**33), or some other way to do this? I don't want to shuffle the contents of the file again and again, since that takes quite a lot of time.
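
For context on the pool error: generator objects cannot be pickled, so they cannot be sent to worker processes. One possible workaround, sketched here under assumptions that are not part of the original post, is to pass plain integer index ranges to `apply_async` and let each worker compute its own values from the indices. The names `index_to_value`, `process_chunk`, `run_in_pool`, and the constants `MULTIPLIER` and `OFFSET` are hypothetical, and the affine map only scrambles the order; it is not a statistically strong shuffle.

```python
from multiprocessing import Pool

BITS = 33
SIZE = 1 << BITS          # 2**33 values: 0 .. 2**33 - 1
MULTIPLIER = 0x9E3779B1   # hypothetical constant; any odd number keeps the map one-to-one
OFFSET = 12345            # hypothetical constant; any integer works


def index_to_value(index):
    """Affine map modulo 2**33; one-to-one because MULTIPLIER is odd."""
    return (MULTIPLIER * index + OFFSET) % SIZE


def process_chunk(start, stop):
    """Worker: handle the values for indices start..stop-1 (here, just sum them)."""
    return sum(index_to_value(i) for i in range(start, stop))


def run_in_pool(total=1_000_000, chunk=250_000, processes=4):
    """Submit one apply_async task per index range and collect the results."""
    with Pool(processes) as pool:
        pending = [
            pool.apply_async(process_chunk, (start, min(start + chunk, total)))
            for start in range(0, total, chunk)
        ]
        return [task.get() for task in pending]
```

To try it, call `run_in_pool()` from under an `if __name__ == "__main__":` guard. Only integers cross the process boundary, so nothing unpicklable is passed to the pool.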

Uday
  • Does this answer your question? [Create a random permutation of 1..N in constant space](https://stackoverflow.com/questions/10054732/create-a-random-permutation-of-1-n-in-constant-space) – Peter O. May 02 '20 at 17:30
  • I have looked at the question. It does give an algorithm but I want to know how this is done in python – Uday May 02 '20 at 18:43
  • Have you seen this question: https://stackoverflow.com/questions/49956883/efficient-random-generator-for-very-large-range-in-python/49957003 ? – Thierry Lathuille May 03 '20 at 14:54
  • @ThierryLathuille Yes that question helped me. I did some modifications to the algorithm proposed and am using that. Thanks for the comment! – Uday May 04 '20 at 04:25
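
The constant-space idea raised in the comments above can be sketched in Python roughly as follows. This is not the asker's modified algorithm (which is not shown), only a minimal illustration: it maps each index in `[0, 2**bits)` to a distinct value in the same range by composing steps that are each invertible modulo `2**bits`. The function names, the `key` parameter, and the multiplier constant are assumptions chosen for the example, and the mixing is not cryptographically strong.

```python
def permute(index, bits=33, key=0xC0FFEE, rounds=4):
    """Map index to a distinct value in [0, 2**bits) using invertible steps."""
    mask = (1 << bits) - 1
    shift = bits // 2 + 1
    x = index & mask
    for r in range(rounds):
        # Multiplying by an odd constant and adding a constant are both
        # one-to-one modulo 2**bits.
        x = (x * 0x9E3779B1 + key + r) & mask
        # XOR with a right-shifted copy of itself is also one-to-one.
        x ^= x >> shift
    return x


def unique_random_numbers(bits=33, key=0xC0FFEE):
    """Yield every number in [0, 2**bits) exactly once, in scrambled order."""
    for index in range(1 << bits):
        yield permute(index, bits, key)
```

Because `permute` is a pure function of the index, no list of 2**33 numbers is ever stored, and a different `key` gives a different order without reshuffling anything.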

1 Answer


As I understand it, the gist of your code is to generate random integers between 0 and NUM, and in your case NUM is 2**33.

The following code does that, and you can change NUM freely:

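import random


def generate_random(num):
    # Yield one random integer between 0 and num (inclusive) per next() call.
    # Values may repeat: nothing here tracks what was already produced.
    while True:
        yield random.randint(0, num)

# Set the seed to get consistent results
random.seed(0)

# Now, let's use this simple function to generate
# 5 random numbers between `0` and `2**33`:
NUM = 2**33  # an int; math.pow(2, 33) would return a float, and randint expects integers
gen = generate_random(NUM)
for i in range(5):
    print(next(gen))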

# This would print these five numbers
# 7921731533
# 1806341205
# 6490875490
# 6341935620
# 3900315155
Anwarvic
  • I need code that produces unique random numbers between 0 and NUM. Yours doesn't exactly give unique numbers, right? – Uday May 02 '20 at 16:03
  • Technically, the generated numbers aren't unique. But given the mechanics of **pseudo-random generators** and how large the range is, they will be mostly unique. – Anwarvic May 02 '20 at 16:20
  • The question and the OP's code seem to indicate that he wants to shuffle the list of all numbers and get each and every one of them exactly once, just in random order. – Thierry Lathuille May 03 '20 at 14:40
  • @ThierryLathuille, Yeah, but shuffling a list of size `2^33` is very costly computationally. So, I tweaked the problem a bit. My code generates any amount of **not so unique** random numbers within the range of `0` and `NUM`. – Anwarvic May 03 '20 at 14:48
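
How "not so unique" independent draws are can be estimated with the standard birthday approximation, which is not part of the original thread: for k uniform draws from N possible values, the chance of at least one repeat is roughly 1 - exp(-k(k-1)/(2N)). The short sketch below evaluates it for N = 2**33; the function name `collision_probability` is made up for this example.

```python
import math


def collision_probability(draws, size=2**33):
    """Approximate chance that `draws` uniform samples contain a repeat."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-draws * (draws - 1) / (2 * size))


for draws in (1_000, 100_000, 1_000_000):
    print(draws, collision_probability(draws))
```

Under this approximation a repeat is unlikely for a few thousand draws, but at 100,000 draws the chance is already about 44%, and it approaches certainty long before all 2**33 values have been requested.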