1

I have to generate a large number of unique keys. One key should consist of 16 digits. I came up with the following code:

function make_seed()
{
    list($usec, $sec) = explode(' ', microtime());
    return (float) $sec + ((float) $usec * 100000);
}
function generate_4_digits(){
    $randval = rand(100, 9999);
    if($randval < 1000){
        $randval = '0'.$randval;
    }
    return (string)$randval;
}
function generate_cdkey(){
    return generate_4_digits() . '-' . generate_4_digits() . '-' . generate_4_digits() . '-' . generate_4_digits();
}


srand(make_seed());
echo generate_cdkey();

The result was quite promising, 6114-0461-7825-1604. Then I decided to generate 10 000 keys and see how many duplicates I get:

srand(make_seed());
$keys = array();
$duplicates = array();
for($i = 0; $i < 10000; $i++){
    $new_key = generate_cdkey();
    if(in_array($new_key, $keys)){
        $duplicates[] = $new_key;
    }
    $keys[] = $new_key;
}
$keys_length = count($keys);
var_dump($duplicates);
echo '<pre>';
for($i = 0; $i < $keys_length; $i++){
    echo $keys[$i] . "\n";
}
echo '</pre>';

On the first run I got 1807 duplicates, which was quite disappointing. But to my great surprise, on each following run I get the same number of duplicates!? When I looked closely at the generated keys, I realized the last 1807 keys were exactly the same as the first ones. So I can generate 8193 keys without a single duplicate?! This is so close to 2^13 (8192)?! Can we conclude that rand() is suited to generating at most 2^13 unique numbers? But why?

I changed the code to use mt_rand() and I get no duplicates even when generating 50 000 keys.
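
The figures above can be checked with a little arithmetic. The sketch below only restates the numbers from the question and prints the platform's getrandmax(); the variable names are made up for illustration, and whether the rand() range is the actual cause of the repetition is an assumption, not something established here.

```php
<?php
// Figures reported in the question.
$totalKeys   = 10000;
$duplicates  = 1807;
$callsPerKey = 4; // generate_cdkey() calls rand() once per 4-digit group

$uniqueKeys = $totalKeys - $duplicates;   // keys seen before the sequence repeated
$randCalls  = $uniqueKeys * $callsPerKey; // rand() calls made for those keys

echo "unique keys: $uniqueKeys (2^13 = ", 1 << 13, ")\n";
echo "rand() calls: $randCalls (2^15 = ", 1 << 15, ")\n";

// Largest value rand() can return on this platform. The PHP manual notes that
// on some platforms, such as Windows, this was only 32767 (2^15 - 1).
echo "getrandmax(): ", getrandmax(), "\n";
```

On a platform where getrandmax() reports 32767, the number of rand() calls made before the repetition is within a handful of 2^15, which may be worth investigating.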

Martin Dimitrov
  • What's a CD key? And how about just using an existing UUID / GUID generator? – Evert Sep 13 '12 at 21:53
  • @Evert, I want to generate a key with 16 digits. 32 hex digits is quite large for my needs. – Martin Dimitrov Sep 13 '12 at 21:54
  • Tested the code, no duplicates here .. tested on PHP 5.3.6 – dbf Sep 13 '12 at 21:54
  • @dbf, hm, very strange. Let me try it on codepad.org – Martin Dimitrov Sep 13 '12 at 21:55
  • @dbf, I get timeouts on codepad.org. I am on Windows, PHP 5.3.8 – Martin Dimitrov Sep 13 '12 at 21:59
  • $randval = rand(100, 9999); if($randval < 1000){ $randval = '0'.$randval; } could be condensed to this: $randval = sprintf("%04d", rand(100, 9999)); – iandouglas Sep 13 '12 at 21:59
  • Does using `mt_rand` instead of `rand` make any difference? Also, see this answer: [How are software license keys generated?](http://stackoverflow.com/questions/3002067/how-are-software-license-keys-generated) – drew010 Sep 13 '12 at 22:00
  • @iandouglas, that looks much better. Thanks. – Martin Dimitrov Sep 13 '12 at 22:02
  • Any particular reason you're using just decimal numbers? You could include hex values (A-F) and expand each group of 4 as high as 65535 giving you way more options. $randval = sprintf("%04X", rand(256,65535)); Hex would also avoid ambiguity of 1's vs I's, 0's vs O's of using the entire alphabet. Even if you don't theoretically need that many, it might still help your overlap problem? – iandouglas Sep 13 '12 at 22:37
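
The sprintf() suggestion from the comments can be sketched as follows. The function names are hypothetical, and two things differ from the question's code and are assumptions of this sketch: it uses mt_rand() rather than rand(), and it widens the range to 0..9999, since rand(100, 9999) can never produce the groups 0000 to 0099.

```php
<?php
// Hypothetical helper: one zero-padded group of four decimal digits.
function generate_4_digits_padded()
{
    // "%04d" left-pads with zeros, so 42 becomes "0042".
    return sprintf('%04d', mt_rand(0, 9999));
}

// Hypothetical helper: four groups joined by dashes, e.g. "0042-9310-0007-5821".
function generate_cdkey_padded()
{
    $groups = array();
    for ($i = 0; $i < 4; $i++) {
        $groups[] = generate_4_digits_padded();
    }
    return implode('-', $groups);
}

$key = generate_cdkey_padded();
echo $key, "\n";
```

Padding with sprintf() removes the manual '0' prefix branch and keeps every group exactly four characters wide.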

3 Answers

1

Throw some uniqid() in there.

http://www.php.net/manual/en/function.uniqid.php

Rudolph Gottesheim
1

This is probably more what you're looking for: openssl_random_pseudo_bytes ( int $length [, bool &$crypto_strong ] )

FluffyJack
  • To explain my thought process: you could specify a length of 8 bytes (16 characters once hex-encoded, I think) and then add a '-' after every 4 characters. – FluffyJack Sep 13 '12 at 22:08
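
A minimal sketch of that idea, assuming the openssl extension is loaded. The function name generate_hex_key() is hypothetical. Note that the result consists of hexadecimal characters (0-9, a-f), not decimal digits only, so it does not strictly match the 16-digit requirement in the question.

```php
<?php
// Hypothetical helper: 8 random bytes -> 16 hex characters -> groups of 4.
function generate_hex_key()
{
    // The second argument is set by reference to report whether a
    // cryptographically strong algorithm was used.
    $bytes = openssl_random_pseudo_bytes(8, $cryptoStrong);
    if ($bytes === false || !$cryptoStrong) {
        throw new RuntimeException('No strong random source available');
    }

    $hex = bin2hex($bytes);                  // 8 bytes become 16 hex characters
    return implode('-', str_split($hex, 4)); // a dash between each group of 4
}

$key = generate_hex_key();
echo $key, "\n";
```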
0

This might have something to do with the behaviour of srand. When checking for duplicates you are only calling srand once for all 10000 keys. Perhaps a single seed only yields enough values for ~2^13 keys? What PHP version are you using? Since 4.2.0, calling srand is no longer needed because the generator is seeded automatically, but perhaps if you call it anyway, PHP stops seeding automatically for the rest of the script.
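
One way to test this is to drop the explicit seeding and see whether the repetition goes away. The sketch below is an illustration rather than a confirmed fix: the function name generate_unique_keys() is hypothetical, it uses mt_rand() as the asker later did, and it relies on PHP seeding the generator automatically. It also stores keys as array indexes, so a duplicate overwrites its earlier entry instead of requiring an in_array() scan.

```php
<?php
// Hypothetical helper: collect $count distinct keys without calling srand().
function generate_unique_keys($count)
{
    $keys = array();
    while (count($keys) < $count) {
        $groups = array();
        for ($i = 0; $i < 4; $i++) {
            $groups[] = sprintf('%04d', mt_rand(0, 9999));
        }
        // Using the key as the array index makes the duplicate check implicit:
        // a repeated key overwrites itself and count() does not grow.
        $keys[implode('-', $groups)] = true;
    }
    return array_keys($keys);
}

$keys = generate_unique_keys(10000);
echo count($keys), " unique keys generated\n";
```

If the generator started repeating a short cycle, this loop would never reach the requested count, so a production version would want a cap on the number of attempts.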

gandaliter