0

I would like to have GUIDs in my app, in the shape of /[0-9]{32}/, so it could have 12131415... or 1020304010101010101..., each one exactly 32 characters long.

I would like for it to be evenly distributed over the set. By that I mean I don't want the IDs to be used in this order:

10000000000000000000000000000000
10000000000000000000000000000001
10000000000000000000000000000002
10000000000000000000000000000003
...

I would like for it to feel random:

91837183748159173751758138571830
18281849104849293849185918918905
...

The way I am thinking about doing this is by just generating all of the numbers up front, and then shuffling the array. But this won't work for large sets of strings like what I have here, it will run out of memory.

So I am not sure what to do instead.

I am doing this in Node.js and will save each GUID as a record to the database as a GUID schema/model with a property/attribute value which contains its actual ID, using Mongoose.

It seems the next best option is to randomly generate a string, and check if it's unique first before trying again. But I've tried this approach before and it seems to exponentially slow down the larger the set gets.

So I'm wondering what the general approach is to generating a large set of numbers like this so that you can pluck one at random and it feels random. I only currently need like 1 trillion of them, but I would like for it to be not 0-1000000000000, but 1 trillion out of the set of 0-99999999999999999999999999999999, seemingly evenly distributed.

Wondering how to accomplish that, either in my specific Node.js/JavaScript/Mongoose situation, or just generally in pseudocode or any procedural language.

Lokasa Mawati
  • 441
  • 5
  • 15
  • `But I've tried this approach before and it seems to exponentially slow down the larger the set gets.` It shouldn't - `Set.has` is `O(1)` (don't use an array). Also, `10**32` possibilities should be enough to make the chance of a collision infinitesimal – CertainPerformance May 07 '19 at 08:21
  • Just generate a trillion random numbers and not care about duplicates. Because your range is really really big compared to 1 trillion. 1 trillion divided by 99999999999999999999999999999999 is about 1e-20. – Sweeper May 07 '19 at 08:22
  • The reason it slows down is because you have to try more and more numbers as you use them up. So when you are at 99% used up, you generate a new random number, and it's likely to be used, then the next is likely to be used, etc. – Lokasa Mawati May 07 '19 at 08:22
  • @Sweeper I would like exactly 1 trillion numbers in this case. – Lokasa Mawati May 07 '19 at 08:28
  • As I said, 1 trillion is a _tiny_ number compared to 99999999999999999999999999999999. You will only use up 0.00000000000000001% of the available numbers. So don't worry about duplicates. – Sweeper May 07 '19 at 08:30
  • Oh ok, cool. I see. – Lokasa Mawati May 07 '19 at 08:34

0 Answers0