I need to select 3.7*10^8 unique values from the range [0, 3*10^9] and either obtain them in order or keep them in memory.
To do this, I started working on a simple algorithm where I sample smaller uniform distributions (that fit in memory) in order to indirectly sample the large distribution that really interests me.
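A minimal sketch of that idea, written for illustration and not taken from the gist: split the large range into blocks, decide how many of the picks fall into each block, then sample inside each block with `random.sample`. For the overall result to be a uniform sample without replacement, the per-block count has to follow a hypergeometric distribution. The names `sample_in_blocks`, `hypergeometric_count`, and `block_size` are hypothetical, and the pure-Python hypergeometric draw is an assumption chosen for clarity, not speed.

```python
import random


def hypergeometric_count(good, bad, draws, rng):
    """Count 'good' items among `draws` items drawn without replacement
    from a population of `good` + `bad` items (direct simulation)."""
    total = good + bad
    if draws > total:
        raise ValueError("cannot draw more items than the population holds")
    # The hypergeometric distribution is symmetric in (good, draws),
    # so simulate whichever needs fewer loop iterations.
    if draws > good:
        good, draws = draws, good
        bad = total - good
    hits = 0
    for _ in range(draws):
        # The next item is 'good' with probability good / (good + bad).
        if rng.randrange(good + bad) < good:
            good -= 1
            hits += 1
        else:
            bad -= 1
    return hits


def sample_in_blocks(total, n, block_size, rng=None):
    """Yield n unique integers from range(total) in increasing order,
    holding only one block's worth of picks in memory at a time."""
    rng = rng or random.Random()
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and total")
    needed = n
    for lo in range(0, total, block_size):
        hi = min(lo + block_size, total)
        size = hi - lo
        rest = total - hi  # values that lie after this block
        # How many of the still-needed picks land in this block.
        k = hypergeometric_count(size, rest, needed, rng)
        needed -= k
        # Sample inside the small block, then emit in sorted order.
        yield from sorted(rng.sample(range(lo, hi), k))


if __name__ == "__main__":
    # Small stand-in for the real problem sizes.
    picks = list(sample_in_blocks(3000, 370, 256, random.Random(0)))
    print(len(picks), picks[:5])
```

Because blocks are visited in increasing order, the values come out sorted and can be streamed to disk instead of kept in memory. At the real problem sizes the Python loop in `hypergeometric_count` would likely be too slow; a library routine such as NumPy's `Generator.hypergeometric` could replace it, subject to its documented limits on argument size.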
The code is available in the following gist: https://gist.github.com/legaultmarc/7290ac4bef4edb591d1e
Since I'm having trouble implementing something more robust, I was wondering if you had other ideas for sampling unique values from a large discrete uniform distribution. I'm looking for an algorithm, a module, or an idea on how to manage very large lists directly (perhaps using the hard drive instead of memory).