
I'm using the Laravel 5.1 framework.

I need to create a set of unique 5-character codes from the characters 'BCDFGHIJKLMNPQRSTVWXYZ', for example DFGHJ.

This is my code:

$maxCodes = 1200000;
$codeSize = 5;
$characters = 'BCDFGHJKLMNPQRSTVWXY';
$cLenght = strlen($characters);
$sizeBlock = $maxCodes/($cLenght*4); // For not overflow memory when put the codes in array

for ($i=0; $i < $cLenght; $i++) {
    $subCharacters = str_replace($characters[$i], '', $characters);
    $existCodes = [];
    for ($k = 0; $k < 4; $k++) {
        $codesToInsert = [];
        for ($j = 0; $j < $sizeBlock;) {
            $code = $characters[$i] . substr(str_shuffle($characters), 0, $codeSize-1); 

            if (!isset($existCodes[$code])) {
                $existCodes[$code] = '';
                array_push($codesToInsert, ['code' => $code, 'used' => 0, 'rand' => rand(0,10)]);
                $j++;
            }
        }

        \App\Code::insert($codesToInsert);
    }
}

Let me explain my code a little. Because the codes are generated in alphabetical order, I added a rand field holding a random number. I created this field so that I can later run a query like Code::where('used', 0)->where('rand', rand(0,9)) and retrieve a random code rather than the next one in alphabetical order.

The problem is that when I run the function to generate the codes, it creates only 135000 of them instead of $maxCodes, and I don't know why.

Can someone help me?

Thanks!

PS: Sorry for my English.

Krlinhos
  • Without sitting down and doing a lot of maths, I suspect you've not generated random codes - you've generated *every* possible combination of 5 letters from a set of 20 in alphabetical order. – symcbean Dec 09 '15 at 14:01
  • Not necessarily, the user can set $maxCodes. There are about 1.8 million possible codes; in the example code I set 1.2 million. – Krlinhos Dec 09 '15 at 14:22
  • I suspect the best way of generating codes of length 5 for this requirement is to generate all the combinations and shuffle them. The reason is that a large portion of the combinations is required, so generating random ones and checking whether each is unique will produce a lot of collisions. See: [PHP algorithm to generate all combinations of a specific size from a single set](http://stackoverflow.com/questions/19067556/php-algorithm-to-generate-all-combinations-of-a-specific-size-from-a-single-set). – Ryan Vincent Dec 10 '15 at 10:05
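
The suggestion in the last comment (build every combination, then shuffle) can be sketched as follows. This is only an illustration, not code from the thread: the function name `allCodesForFirstLetter` is made up here, and it assumes that a letter may not repeat inside a code, which is what the 1.8 million figure implies (20 × 19 × 18 × 17 × 16 = 1,860,480).

```php
<?php

/**
 * Hypothetical helper (not from the thread): build every code of $codeSize
 * letters that starts with $first and never repeats a letter, then shuffle.
 */
function allCodesForFirstLetter($characters, $first, $codeSize)
{
    $codes = array($first);                 // partial codes built so far
    for ($pos = 1; $pos < $codeSize; $pos++) {
        $next = array();
        foreach ($codes as $prefix) {
            foreach (str_split($characters) as $char) {
                // skip letters that already occur in this partial code
                if (strpos($prefix, $char) === false) {
                    $next[] = $prefix . $char;
                }
            }
        }
        $codes = $next;
    }
    shuffle($codes);                        // random order, no uniqueness check needed
    return $codes;
}

$codes = allCodesForFirstLetter('BCDFGHJKLMNPQRSTVWXY', 'B', 5);
echo count($codes), "\n"; // 19 * 18 * 17 * 16 = 93024 codes starting with 'B'
```

Repeating this for each of the 20 first letters covers all 1,860,480 codes, and taking the first N entries of each shuffled list gives N unique codes per letter without any retries.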

1 Answer


The code is commented.

Limitations: the closer the requested count gets to the total number of possible combinations, the sillier the duplicate counts get, because most randomly generated codes will already exist. I would expect it to take about 30 minutes to generate all 1.2 million codes on my PC, and a lot longer for 1.5 million. It was interesting to do.

Details for the letter 'B' with samples including duplicate counts

$tryCount integer 89166
$existCount integer 68422
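
As a rough sanity check on these two counts, the expected number of attempts needed to collect n distinct codes from N equally likely ones is the sum of N / (N - i) for i from 0 to n - 1. The sketch below evaluates that sum. It assumes N = 20^4 = 160,000 possible codes per first letter (four free positions, letters may repeat, as in the sample), and the function name `expectedTries` is invented for this illustration.

```php
<?php

/**
 * Expected number of uniform random draws needed to see $wanted distinct
 * values out of $possible equally likely values.
 */
function expectedTries($possible, $wanted)
{
    $tries = 0.0;
    for ($i = 0; $i < $wanted; $i++) {
        // with $i distinct codes already stored, a draw is new
        // with probability ($possible - $i) / $possible
        $tries += $possible / ($possible - $i);
    }
    return $tries;
}

$possible = pow(20, 4); // 160000 codes per first letter (assumption, see above)
printf("%.0f\n", expectedTries($possible, 68422)); // compare with the measured $tryCount
```

The same formula shows why the duplicate rate climbs steeply as the requested count approaches the number of possible codes: each term N / (N - i) grows without bound as i gets close to N.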

Timings:

Script start date and time : 2015-12-09 15:45:47.172
Script stop end date and time: 2015-12-09 15:46:40.109
Total execution time : 52.938000000000002

Sample:

'BXCLF' => integer 6
'BYRXS' => integer 6
'BPGGG' => integer 6
'BRQJJ' => integer 6
'BKJKQ' => integer 6
'BNQVG' => integer 6
'BLQSX' => integer 6
'BCGDH' => integer 5
'BBLFG' => integer 5
'BFBJQ' => integer 5 
'BRVRJ' => integer 5
'BJXTX' => integer 5

Hardware: PC - dual hamster powered.

The code for the RandomData class: Random data for use in PHP

The code:

<?php

include __DIR__ . '/__bootstrap__.php';

use app\system\encryption\RandomData;
use app\system\ElapsedTiming;

$maxCodes =  50000;
$codeSize = 5;
$characters = 'BCDFGHJKLMNPQRSTVWXY';
$cLength = strlen($characters) - 1; // index of the last character

// how many to generate for each initial letter; there are ($cLength + 1) letters
$codesPerLetter = (int) ($maxCodes / ($cLength + 1));

$timer = new ElapsedTiming();
$timer->start();

/**
 * @var RandomData
 */
$randSrc = RandomData::instance(); // source of random numbers

for ($i=0; $i <= $cLength; $i++) { // use each char as an initial character

    $existCodes = array(); // current batch of codes in here
    $existCount = 0;

    $newCode = $characters[$i] . '0000'; // each `batch` starts with a different letter
                                         // then four random letters from the string...

    $tryCount = 0; // count the number of attempts to generate the required count
    while ($existCount < $codesPerLetter) {

        // generate a random 4 character string using all the characters
        for ($k = 1; $k < 5 ; $k++) { // first letter is fixed then fill in the rest
            // generate code...
            $char = $characters[$randSrc->int32(0, $cLength)];
            $newCode[$k] = $char;
        }

        $tryCount++;

        // test if exists
        if (isset($existCodes[$newCode])) { // store as keys not values
            $existCodes[$newCode]++;         // count duplicate generation
        }
        else {
            $existCodes[$newCode] = 1;       // a new one
            $existCount++;                   // closer to the target
        }
    }

    /**
     * Write this batch of codes to the database...
     */
}

/* debug information */
$timer->stop();
arsort($existCodes);

$timer->printFullStats();

\Kint::dump($tryCount, $existCount, array_slice($existCodes, 0, 100));
Ryan Vincent
  • Sorry Ryan, but I don't understand :( Are you saying that my code takes about 30 minutes, but your proposal takes less? So I only have to replace my code with the code you posted? Thanks! – Krlinhos Dec 09 '15 at 20:52
  • @Krlinhos, Sorry for the confusion - I have only timed the code I provided. The major difference, compared to your code, is that it picks 4 random characters out of all 20 characters for each new code. It then checks to see if it is a new code. The timings are for my PC. It takes about a minute to generate 60000 (sixty thousand) unique codes. The difference between `$tryCount` and `$existCount` is the number of duplicates that were generated. – Ryan Vincent Dec 09 '15 at 21:56
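
The approach described in the reply above (fixed first letter, four random letters, duplicates rejected by storing the codes as array keys) can also be sketched without the RandomData class. This version substitutes PHP's built-in `mt_rand()` as the random source, which is an assumption made for the example rather than what the answer uses, and `uniqueCodesForLetter` is a hypothetical name.

```php
<?php

/**
 * Hypothetical helper: generate $count unique codes that start with $first,
 * followed by ($codeSize - 1) random letters that may repeat.
 */
function uniqueCodesForLetter($characters, $first, $codeSize, $count)
{
    $maxIndex = strlen($characters) - 1;
    // guard against asking for more codes than exist, which would loop forever
    if ($count > pow($maxIndex + 1, $codeSize - 1)) {
        throw new InvalidArgumentException('Not enough possible codes');
    }
    $codes = array();
    while (count($codes) < $count) {
        $code = $first;
        for ($k = 1; $k < $codeSize; $k++) {
            $code .= $characters[mt_rand(0, $maxIndex)]; // both bounds are inclusive
        }
        $codes[$code] = true; // array keys are unique, so duplicates collapse
    }
    return array_keys($codes);
}

$batch = uniqueCodesForLetter('BCDFGHJKLMNPQRSTVWXY', 'B', 5, 1000);
echo count($batch), "\n"; // 1000
```

Each returned batch could then be written to the database before moving on to the next first letter, so only one letter's codes are held in memory at a time.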