0

Possible Duplicate:
Create Random Number Sequence with No Repeats

I'd like to write an URL shortener that only uses numbers as short string.

I don't want to count up, I want the next new number to be random (or pseudo random).

At first thought algorithm would then look like this (pseudo code):

do 
{
 number = random(0,10000)
}
while (datastore.contains(number))

datastore.store(number, url)

The problem with this implementation is: As the datastore contains more numbers, the more likely it is that the loop will be executed multiple times. The performance will decrease over time.

Isn't there a better way to get a random number that is not already in use?

Community
  • 1
  • 1
leifg
  • 8,668
  • 13
  • 53
  • 79
  • 1
    Related: http://stackoverflow.com/questions/693880/create-random-number-sequence-with-no-repeats – Thilo Nov 10 '11 at 11:11
  • 1
    Take a look at [Unique random numbers in O(1)?](http://stackoverflow.com/questions/196017/unique-random-numbers-in-o1) – Gumbo Nov 10 '11 at 11:12
  • Also note that if you use short numbers instead of longer UUID, those numbers become guessable, i.e. people can see the URL that other people registered by just trying a few numbers. That may or may not be a problem. – Thilo Nov 12 '11 at 02:56

3 Answers3

5

1) fill an array with sequential values

2) shuffle the array

justin
  • 104,054
  • 14
  • 179
  • 226
1

Use an encryption. Since encryption is reversible, unique inputs generate unique outputs. For 64 bit numbers use a cypher with a 64 bit blocksize. For smaller block sizes, such as 32 bit or 16 bit, have a look at the Hasty Pudding Cypher.

Whatever block size you need, just encrypt the numbers 0, 1, 2, ... (in the appropriate block size) to generate as many unique non-sequential numbers as you need.

rossum
  • 15,344
  • 1
  • 24
  • 38
  • If the random numbers have to be unique and in range 0 to 10000 as in problem, encryption doesn't solve the problem. – James Waldby - jwpat7 Nov 10 '11 at 17:20
  • 1
    If you read the description of Hasty Pudding cypher carefully, then you will see that it does. "The Hasty Pudding cipher can also be used to encrypt values in a range that do not translate to strings with an integral number of bits; for instance, it can encrypt a number from 0 to N by producing another number from 0 to N. It does this by using the smallest subcipher that can handle the input as a bit string, and applying it to the input as a bit string, repeatedly, until the output is in the proper range." – rossum Nov 10 '11 at 21:46
0

Some related questions: # 2394246, # 54059, # 158716, # 196017, and # 1608181.

The proper approach depends on how many numbers you will generate and on if realtime performance is required. If you draw no more than a small fraction of the numbers available in a range, average time per number for your code snippet is O(1), with slight increase of time per later number but still O(1). See, for example, question #1608181 answer in which I show that getting k numbers from a range of more than 2*k numbers with such code is O(k). (That answer also has C code to generate M numbers from a range of N numbers, in O(M) time when M<N/2, and explains how to use it for O(M) time when M>=N/2.)

If you want O(1) performance with a hard time limit, you can use the program just mentioned to pre-load an array, or can shuffle the whole range of integers, as mentioned by Justin. After that preprocessing, each access is O(1). Buf if you know you won't draw more than say 3000 numbers from your 1...10000 range, and don't have a hard time limit, the code you have will run in O(1) time on average, with probability of k passes decreasing like 0.3 ^ k; i.e., at worst about 70% chance of 1 pass, 21% for 2, 6% for 3, 2% for 4, 0.6% for 5, and so forth.

Community
  • 1
  • 1
James Waldby - jwpat7
  • 8,593
  • 2
  • 22
  • 37