8

I'm generating a 6 digit code from the following characters. These will be used to stamp on stickers.
They will be generated in batches of 10k or less (before printing) and I don't envisage there will ever be more than 1-2 million total (probably much less).
After I generate the batches of codes, I'll check the MySQL database of existing codes to ensure there are no duplicates.

// exclude problem chars: B8G6I1l0OQDS5Z2

$characters = 'ACEFHJKMNPRTUVWXY4937';

$string = '';

for ($i = 0; $i < 6; $i++) {
    $string .= $characters[rand(0, strlen($characters) - 1)];
}   

return $string;
  1. Is this a solid approach to generating the code?
  2. How many possible permutations would there be? (6 Digit code from pool of 21 characters). Sorry math isn't my strong point
CSᵠ
  • 10,049
  • 9
  • 41
  • 64
Quad6
  • 385
  • 2
  • 5
  • 19
  • there are some good posts here on various approaches: http://stackoverflow.com/questions/1846202/php-how-to-generate-a-random-unique-alphanumeric-string –  May 09 '13 at 23:38
  • If you don't need more than 8 million codes and short and unambiguous codes is a high priority I'd say you have a good way to do it. – SEngstrom May 09 '13 at 23:39

6 Answers6

14

21^6 = 85766121 possibilities.

Using a DB and storing used values is bad. If you want to fake randomness you can use the following:

Reduce to 19 possible numbers and make use of the fact that groups of order p^k where p is an odd prime are always cyclic.

Take the group of order 7^19, using a generator co-prime to 7^19 (I'll pick 13^11, you can choose anything not divisible by 7).

Then the following works:

$previous = 0;

function generator($previous)
{

  $generator = pow(13,11);
  $modulus = pow(7,19); //int might be too small
  $possibleChars = "ACEFHJKMNPRTUVWXY49";

  $previous = ($previous + $generator) % $modulus;
  $output='';
  $temp = $previous;

  for($i = 0; $i < 6; $i++) {
    $output += $possibleChars[$temp % 19];
    $temp = $temp / 19;
  }

  return $output;
}

It will cycle through all possible values and look a little random unless they go digging. An even safer alternative would be multiplicative groups but I forget my math already :(

Jean-Bernard Pellerin
  • 12,556
  • 10
  • 57
  • 79
  • 1
    The generated codes need to be stored as they are associated with stamps (products) which will have info associated with them. If I generate 10k of codes then a month later another 10k, I need to somehow check that there are no duplicates in this new 10k batch. – Quad6 May 10 '13 at 00:00
  • I've posted how to ignore duplicates, but if you're storing additional info anyways disregard this answer. – Jean-Bernard Pellerin May 10 '13 at 00:03
  • 2
    @Jean-BernardPellerin php-ized you code, hope you don't mind, please check to see if everything is allright – CSᵠ May 10 '13 at 00:19
  • will this be unique every time or do i need to check that manually ? and i don't want it to be from 21 chars only , anything will ok for me. – keen Sep 24 '13 at 06:11
  • how to use this answere – PHP_USER1 Jan 04 '14 at 17:09
  • 1
    6^19 or 19^6? 6 is not an odd prime, so I'm assuming you meant the latter. In which case your choice of generator is not co-prime to it (19^11) - 11^19 looks OK. I was looking for 5-character codes and arrived at this answer; I am using 23^5 (removed I, O and Q from the alphabet) as modulus and 19^11 for the generator which seems to work for me. – FreeBird Dec 29 '14 at 08:49
  • it should be $output .= $possibleChars[$temp % 19]; – yudijohn Oct 27 '15 at 07:12
7
  • There is a lot of possible combination with or without repetition so your logic would be sufficient
  • Collision would be frequent because you are using rand see str_shuffle and randomness.
  • Change rand to mt_rand
  • Use fast storage like memcached or redis not MySQL when checking

Total Possibility

21 ^ 6 = 85,766,121

85,766,121 should be ok , To add database to this generation try:

Example

$prifix = "stamp.";

$cache = new Memcache();
$cache->addserver("127.0.0.1");

$stamp = myRand(6);
while($cache->get($prifix . $stamp)) {
    $stamp = myRand(6);
}
echo $stamp;

Function Used

function myRand($no, $str = "", $chr = 'ACEFHJKMNPRTUVWXY4937') {
    $length = strlen($chr);
    while($no --) {
        $str .= $chr{mt_rand(0, $length- 1)};
    }
    return $str;
}
Community
  • 1
  • 1
Baba
  • 94,024
  • 28
  • 166
  • 217
  • Thanks. 86 mill is sufficient. I will use your suggestion of mt_rand. I'm not sure how memcached can be used to check that codes in a newly generated batch don't already existing in MySQL (containing all previously generated batches of codes). For printing purposes, I imagine generating 10k or so codes at a time – Quad6 May 10 '13 at 00:02
3

You would have 21 ^ 6 codes = 85 766 121 ~ 85.8 million codes!

To generate them all (which would take some time), look at the selected answer to this question: algorithm that will take numbers or words and find all possible combinations.

Community
  • 1
  • 1
silkfire
  • 24,585
  • 15
  • 82
  • 105
3

as Baba said generating a string on the fly will result in tons of collisions. the closer you will go to 80 millions already generated ones the harder it will became to get an available string

another solution could be to generate all possible combinations once, and store each of them in the database already, with some boolean column field that marks if a row/token is already used or not

then to get one of them

SELECT * FROM tokens WHERE tokenIsUsed = 0 ORDER BY RAND() LIMIT 0,1

and then mark it as already used

UPDATE tokens SET tokenIsUsed = 1 WHERE token = ...
  • Thanks, I would have thought there'd be quite a performance hit if storing that 80 mill combinations, in terms of queries etc. The code will be looked up not just stored as a log. – Quad6 May 10 '13 at 00:31
  • 2
    I think you could combine this approach with @Jean-Bernard's cyclic generation and make your query plenty fast - fill up the database with all the combinations in a somewhat-random order, and each time you need new values you can grab them bunches at a time. Might be worth a try, anyway. – Jerry May 10 '13 at 00:56
0

I had the same problem, and I found very impressive open source solution:

http://www.hashids.org/php/

You can take and use it, also it's worth it to look in it's source code to understand what's happening under the hood.

Tamás Pap
  • 17,777
  • 15
  • 70
  • 102
-2

Or... you can encode username+datetime in md5 and save to database, this for sure will generate an unique code ;)