Here's an example of initializing an array of ten million random numbers, using a list (a), and using a tuple-like generator (b). The result is exactly the same, and the list or generator is never used afterwards, so there's no practical advantage with one or the other:
from random import randint
from array import array
a = array('H', [randint(1, 100) for _ in range(0, 10000000)])
b = array('H', (randint(1, 100) for _ in range(0, 10000000)))
So the question is which one to use. In principle, my understanding is that a tuple should be able to get away with using fewer resources than a list, but since this list and tuple are not kept, it should be possible for the code to be executed without ever initializing the intermediate data structure… My tests indicate that the list is slightly faster in this case. I can only imagine that this is because the Python implementation has more optimization around lists than tuples. Can I expect this to be consistent?
More generally, should I use one or the other, and why? (Or should I do this kind of initialization some other way completely?)
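The resource side of this can be checked directly rather than guessed at. Here is a rough sketch using the standard-library tracemalloc module to compare the peak traced memory of the two variants. The element count N and the helper name peak_bytes are my own choices for illustration, and N is much smaller than the ten million above so it runs quickly. The numbers depend on the interpreter and version, so treat them as an indication only.

```python
import tracemalloc
from array import array
from random import randint

N = 100_000  # assumption: scaled down from the 10,000,000 used in the question


def peak_bytes(build):
    """Run build() and return the peak memory traced by tracemalloc while it ran."""
    tracemalloc.start()
    try:
        build()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()  # stopping also clears the traces for the next run
    return peak


# Variant a: the list comprehension builds the whole list before array() sees it.
peak_list = peak_bytes(lambda: array('H', [randint(1, 100) for _ in range(N)]))

# Variant b: the generator expression hands values to array() one at a time.
peak_gen = peak_bytes(lambda: array('H', (randint(1, 100) for _ in range(N))))

print(f"list version peak:      {peak_list} bytes")
print(f"generator version peak: {peak_gen} bytes")
```

If the list version shows a clearly higher peak, that would suggest the intermediate list really is built in full, even though it is thrown away immediately afterwards.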
Update: Answers and comments made me realize that the b example is not actually a tuple but a generator, so I edited the headline and the text above a bit to reflect that. Also, I tried splitting the list version into two lines like this, which should force the list to actually be instantiated:
g = [randint(1, 100) for _ in range(0, 10000000)]
a = array('H', g)
It appears to make no difference. The list version takes about 8.5 seconds, and the generator version takes about 9 seconds.
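For anyone who wants to reproduce the comparison, here is a minimal timing sketch using the standard-library timeit module. The function names, the helper best_time, and the reduced element count N are assumptions for illustration, not part of my original test. Taking the minimum over a few repeats is meant to reduce noise from other processes.

```python
import timeit
from array import array
from random import randint

N = 100_000  # assumption: scaled down from the 10,000,000 used in the question


def from_list(n=N):
    """Build the array from a list comprehension (variant a)."""
    return array('H', [randint(1, 100) for _ in range(n)])


def from_generator(n=N):
    """Build the array from a generator expression (variant b)."""
    return array('H', (randint(1, 100) for _ in range(n)))


def best_time(func, repeat=3):
    """Return the fastest of `repeat` single runs of func, in seconds."""
    return min(timeit.repeat(func, number=1, repeat=repeat))


list_seconds = best_time(from_list)
generator_seconds = best_time(from_generator)
print(f"list version:      {list_seconds:.3f} s")
print(f"generator version: {generator_seconds:.3f} s")
```

Most of the time in either variant is probably spent inside randint rather than in building the array, so the difference between the two may be small relative to the total.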