1

This was inspired by a question at a job interview: how do you efficiently generate N unique random numbers? Their security and distribution/bias don't matter.

I proposed the naive approach of calling rand() N times and eliminating duplicates by trial and error, which gives an inefficient and flawed solution. Then I read this SO question; the algorithms there are great for getting quality unique numbers, and they are O(N).
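For reference, one of those O(N) techniques is a partial Fisher-Yates shuffle with a sparse swap map, which avoids materializing the whole range. This is a sketch (the function name and parameters are my own, not from the linked answers):

```python
import random

def unique_randoms(n, upper):
    """Pick n unique values from range(upper) in O(n) time,
    using O(n) extra space for a sparse swap map instead of
    allocating the full range."""
    swaps = {}   # position -> value, for positions we've touched
    result = []
    for i in range(n):
        j = random.randint(i, upper - 1)
        # take whatever value currently sits at position j
        result.append(swaps.get(j, j))
        # move the value at position i into slot j, so it can
        # still be chosen later but never twice
        swaps[j] = swaps.get(i, i)
    return result
```

Each iteration does O(1) work and one rand() call, and no retries are ever needed.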

But I suspect there are ways to get low-quality unique random numbers for dummy tasks in less than O(N) time complexity. Here are some possible ideas:

  • Store many precomputed lists, each containing N numbers, and retrieve one list at random. Complexity is O(1) for fixed N. Storage space used is O(NR), where R is the number of lists.
  • Generate N/2 unique random numbers and then split each into two unequal parts (floor/ceil for odd numbers, n+1/n-1 for even). I know this is flawed (duplicates can pop up), and O(N/2) is still O(N). This is more food for thought.
  • Generate one big random number and then squeeze more variants from it by some fixed manipulations like bitwise operations, factorization, recursion, MapReduce or something else.
  • Use a quasi-random sequence somehow (not a math guy, just googled this term).

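As a sketch of the third idea ("squeeze more variants from one big random number"): a linear congruential generator modulo 2^k with a ≡ 1 (mod 4) and c odd has full period by the Hull-Dobell theorem, so it visits every residue exactly once and its first N outputs are unique by construction. The loop is still O(N), but only a constant number of rand() calls are spent; the names and bit width below are illustrative, and the distribution quality is poor:

```python
import random

def cheap_unique(n, bits=32):
    """First n outputs of a full-period LCG mod 2**bits.
    Uniqueness is guaranteed because the LCG is a permutation
    of the residues; only 3 random() calls total."""
    m = 1 << bits
    assert n <= m
    a = 4 * random.randrange(m // 4) + 1   # a ≡ 1 (mod 4)
    c = random.randrange(m) | 1            # c odd, coprime to m
    x = random.randrange(m)                # seed
    out = []
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x)
    return out
```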
Your ideas?

  • Was their complaint that your solution's big-O complexity was too high, or just that it was too inefficient? Because there are more efficient ways to generate _unique_ numbers in a given range than trial and error. Also, your solution isn't O(n) worst case - to generate a random permutation of n from n, it'll run in O(n^2) time due to the number of retries required. – Nick Johnson Apr 11 '12 at 04:45

2 Answers


Presumably this routine has some kind of output (i.e. the results are written to an array of some kind). Populating an array (or some other data structure) of size N is at least an O(N) operation, so you can't do better than O(N).

Oliver Charlesworth
  • Thanks for the explanation, I modified a question accordingly. – pati Apr 08 '12 at 17:19
  • 1
    @pati- You are probably better off accepting Oli's answer, then posting a second question. That way, anyone searching for your original question can get it answered directly. – templatetypedef Apr 08 '12 at 17:24

You can generate random numbers one after another, and if the result set already contains the new number, just add to it the maximum of the numbers generated so far.

Detecting whether a number was already generated is O(1) using a hash set, so the whole procedure is O(N), with only N random() calls.

Of course, this assumes we do not overflow the upper limit (i.e. would otherwise need a BigInteger).
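A minimal Python sketch of this approach (the function name and bounds are my own; it assumes generated values are at least 1, so a duplicate plus the running maximum always exceeds everything seen so far):

```python
import random

def n_randoms(n, upper=2**31):
    """Exactly n random() calls. On a collision, add the running
    maximum to the duplicate: since x >= 1, x + biggest > biggest,
    which is already >= every value seen, so the result is unique.
    Assumes the sums do not overflow (the BigInteger caveat)."""
    seen = set()
    biggest = 0
    for _ in range(n):
        x = random.randint(1, upper)
        if x in seen:
            x += biggest
        seen.add(x)
        biggest = max(biggest, x)
    return list(seen)
```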

Eugene Retunsky
  • Yep, that's a nice and simple O(N) solution, similar to [Floyd's algorithm](http://stackoverflow.com/a/1608585/1320363) but with less 'math-proven distribution quality', I guess. – pati Apr 08 '12 at 17:31
  • The distribution is not a concern from your question: "Their security and distribution/bias don't matter." – Eugene Retunsky Apr 08 '12 at 17:43