3

I have the following code

number_list = (i for i in range(5))
permutations = (num for num in itertools.product(number_list, repeat=9))

This is generating an iterator called permutations which will hold all permutations of 9 characters within the number_list if I'm not mistaken. This can get pretty large for a big number_list.

I can iterate through permutations with next(permutations) but the problem is that it's sequencial. I would like to be able to draw a random item from any part of the iterator. If it was a list, I could simply do random.choice() but for a big number_list I don't have near enough memory nor time for that.

I could also just use next() and store a list of X amount of items and them randomize them but that won't work either because it can get so incredibly big that the outputs would be so similar it wouldn't really be "random".

I was wondering, if it isn't possible to draw a random item from the iterator, is there an algorithm which allows me to create an iterator which will output a random set with next() but that when it ends it will have gone through the entire permutations witout repeating?

The final idea would be having an iterator that would spit a random permutation of n characters out of a list of i elements, being able to get both n and i to arbitrarily large numbers without memory constraints, making sure that when the whole iterator ends up finishing (doesn't matter when, even if it finished after years in theory), all possible permutations would be exhausted without repetitions.

martineau
  • 119,623
  • 25
  • 170
  • 301
Cromen
  • 85
  • 1
  • 6
  • Not exactly related, but why `number_list = (i for i in range(5))` and why not `number_list = range(5)`??? – juanpa.arrivillaga May 25 '20 at 20:27
  • Because I'm a bit of an idiot, thank you :D – Cromen May 25 '20 at 20:43
  • I think the answer is basically "no" if you want to iterate through the product without repeating. You will either need a sample space up front to shuffle, or you will need to keep track of what you have already seen. Either way you will need to keep something in memory that is proportional to the number of values. – Mark May 25 '20 at 20:56
  • 1
    Have a look at https://stackoverflow.com/questions/49956883/efficient-random-generator-for-very-large-range-in-python – Thierry Lathuille May 25 '20 at 21:34
  • @ThierryLathuille That's a very interesting post. I'm still struggling with the fact that I can't make it guarantee that all permutations would show up, and without repetitions. Would be nice if we could have some mathematical function that would spit out all permutations with a seemingly-random nature, of the whole set. – Cromen May 26 '20 at 09:30
  • The main idea isn't really to achieve true randomness, the idea is to basically shuffle the whole list and print sequencially, but without having to actually shuffle the list because 6**12 items in a list can be quite memory and cpu intensive to do. – Cromen May 26 '20 at 09:32
  • Because for example, let's say a generator which sequencially generates all permutations of numbers 0-9 in length 3. You'd have 000, 001, 002, ..., 999. Now, let's say, for example, that you want to test which number matches a certain condition and you have to test all of them until it hits. You know the solution will most likely not be either in the end or beginning and somewhere in the middle, so starting at 000, 001, 002, 003, 004, is slightly less efficient than having a function that would produce shuffled results, 322, 467, 754, 833, 002, etc. It's a huge difference for big sizes. – Cromen May 26 '20 at 09:37

1 Answers1

-1

Firstly, your code does not generate permutations, but draws with replacement. Secondly, iterators (as the name suggests) are meant to ITERATE through some collection, and not to jump to random places in it (of course, you can write your own __next__ function which does whatever you want - whether you want to call the resulting object an iterator is a philosophical question). Thirdly, producing random samples with replacement is a much studied and implemented question. See for example: https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.choice.html

Igor Rivin
  • 4,632
  • 2
  • 23
  • 35