2

I have a requirement on a project where

  • I need to generate unique IDs.
  • IDs must be upper case.
  • I cannot check the database to see if an ID has been used previously.

We expect many millions of records to be added to the database every month.

I have tried solutions here: PHP: How to generate a random, unique, alphanumeric string? and while they seem to work at first, my testing has shown there would be duplicates over time.

Now I am looking at using uniqid with a prefix. The problem I found with uniqid without a prefix is that duplicates are generated when simultaneous requests reach the server at exactly the same time. I am hoping a prefix would solve this.
I am thinking of using this function:

private function generate_id()
{
    $alpha_numeric = 'ABCDEFGHIJKLMNPQRSTUVWXYZ0123456789';
    $max = strlen($alpha_numeric);
    $prefix = '';

    for ($i = 0; $i < 5; $i++)
    {
        $prefix .= $alpha_numeric[random_int(0, $max - 1)];
    }
    return strtoupper(uniqid($prefix));
}

The prefix would be a 5 character alphanumeric string. Would this be enough to satisfy my requirements?

*****Edit*****

Using a UUID as suggested would be the best way to limit the chance of a collision, but it has been decided to go with the approach above and increase the prefix to 7 characters. The chance of a collision if two IDs were generated at the same millisecond would be around 1 in 8.3 million. That has been deemed acceptable by the higher-ups.
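
For anyone who wants to redo this estimate with their own numbers, here is a rough sketch. It assumes the prefix characters are drawn uniformly from the 35-character alphabet in generate_id() and uses the usual birthday approximation for k IDs that share the same uniqid() timestamp. The function name and parameters are made up for illustration, and the figures it prints are not the ones quoted above.

```php
<?php
declare(strict_types=1);

/**
 * Approximate probability that at least two of $k IDs sharing the same
 * uniqid() timestamp also share the same random prefix (birthday bound).
 * Assumes each prefix character is chosen uniformly and independently.
 */
function prefixCollisionProbability(int $alphabetSize, int $prefixLength, int $k): float
{
    $space = $alphabetSize ** $prefixLength;       // number of distinct prefixes
    $x = ($k * ($k - 1)) / (2 * $space);           // expected number of colliding pairs
    return -expm1(-$x);                            // 1 - e^(-x), stable for tiny x
}

// Alphabet taken from generate_id() in the question (no letter O).
$alphabetSize = strlen('ABCDEFGHIJKLMNPQRSTUVWXYZ0123456789');

foreach ([5, 7] as $length) {
    printf(
        "prefix length %d: %d possible prefixes, p(2 IDs at same timestamp collide) = %.3e\n",
        $length,
        $alphabetSize ** $length,
        prefixCollisionProbability($alphabetSize, $length, 2)
    );
}
```

Raising $k shows how quickly the risk grows when many requests land on the same timestamp.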

Tom Vaughan
  • 3
    Maybe you can use some library like this: https://github.com/ramsey/uuid to render UUID? – Tomasz Aug 16 '18 at 14:56
  • Maybe you could use an array or something similar to track which IDs have already been used. – Comp Aug 16 '18 at 14:57
  • 1
    @Comp So how is OP going to a) keep a 100,000,000 entry array in memory and b) load it without doing a query on the database – RiggsFolly Aug 16 '18 at 15:08
  • If you use MSSQL server as your Database Engine you could use the `GUID` (Globally Unique IDentifier) approach using the `NEWID()` function. Please read more here: https://learn.microsoft.com/en-us/sql/t-sql/functions/newid-transact-sql?view=sql-server-2017 – gkoul Aug 16 '18 at 16:09
  • @RiggsFolly Doing it without a query would work, but yes, the array would be too big. I didn't think about that. – Comp Aug 16 '18 at 17:43

4 Answers

1

If you use Composer or external libraries see https://github.com/ramsey/uuid

Alternatively, this function may meet your needs. Since you need upper-case IDs, pass the result through strtoupper:

/**
 * generate
 *
 * Returns a version 4 UUID
 *
 * @access public
 * @return string
 */
public static function generate()
{
    $data = openssl_random_pseudo_bytes(16);

    $data[6] = chr(ord($data[6]) & 0x0f | 0x40); // set version to 0100
    $data[8] = chr(ord($data[8]) & 0x3f | 0x80); // set bits 6-7 to 10

    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));
}

See https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)

Ron Dobley
0

Have you considered using a unique key in the database to enforce uniqueness? In that case you don't have to check for duplicates yourself: generate a value and attempt to insert the record, repeating with a new value until the insert succeeds.

If MySQL then read this - Using MySQL UNIQUE Index To Prevent Duplicates. If not - look up the documentation of your database of choice.
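
A minimal sketch of that retry loop, assuming PDO in exception mode and a hypothetical records table with a unique (or primary) key on id. The function name, table, and columns are placeholders, not something from the question:

```php
<?php
declare(strict_types=1);

/**
 * Insert a row under a freshly generated ID, retrying with a new ID when
 * the database rejects it as a duplicate.
 * Assumes PDO::ERRMODE_EXCEPTION and a hypothetical table records(id, payload)
 * with a unique key on id.
 */
function insertWithUniqueId(PDO $pdo, callable $generateId, string $payload, int $maxAttempts = 5): string
{
    $stmt = $pdo->prepare('INSERT INTO records (id, payload) VALUES (?, ?)');

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $id = (string) $generateId();
        try {
            $stmt->execute([$id, $payload]);
            return $id; // insert succeeded, so the ID was unused
        } catch (PDOException $e) {
            // SQLSTATE class 23 means an integrity constraint violation,
            // which here is taken to be the duplicate key. Anything else is rethrown.
            if (strncmp((string) $e->getCode(), '23', 2) !== 0) {
                throw $e;
            }
        }
    }

    throw new RuntimeException("No unused ID found after {$maxAttempts} attempts");
}
```

Usage would look something like insertWithUniqueId($pdo, fn () => strtoupper(uniqid('', true)), $data), with the generator swapped for whichever ID scheme you settle on.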

uniqid does not guarantee uniqueness of its return value! Call it with more_entropy set to TRUE to increase the chance of a unique value. Note that the result then contains a period.

return strtoupper(uniqid($prefix, true));

Is it absolutely necessary to limit yourself to only uppercase letters and numbers? This reduces the number of distinct values the function can generate, compared to using uppercase, lowercase, numbers and symbols.

You can also consider cryptographic functions to increase randomness.

Stoil Ivanov
  • Yes, we will be doing that. But, the way this system is going to be designed, the request to server and response back to client will be made before insert to database. – Tom Vaughan Aug 16 '18 at 23:23
0

If you are using PHP7 take a look at http://php.net/manual/en/function.random-bytes.php

e.g.

<?php
echo strtoupper(bin2hex(random_bytes(32)));
?>

This should be unique enough for your requirements. Use more bytes if you feel you need to.
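
To make the byte count easy to adjust, the one-liner can be wrapped in a small helper. This is only a sketch; the function name and the default of 32 bytes are my own choices:

```php
<?php
declare(strict_types=1);

/**
 * Return an upper-case hexadecimal ID built from $bytes random bytes.
 * The result is 2 * $bytes characters long (two hex digits per byte).
 */
function randomUpperHexId(int $bytes = 32): string
{
    if ($bytes < 1) {
        throw new InvalidArgumentException('Byte count must be at least 1');
    }
    return strtoupper(bin2hex(random_bytes($bytes)));
}

echo randomUpperHexId(16), "\n"; // 32 characters, 128 random bits
echo randomUpperHexId(32), "\n"; // 64 characters, 256 random bits
```

The output only contains 0-9 and A-F, so it satisfies the upper-case requirement at the cost of a longer string than a full alphanumeric alphabet would need.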

Chris Wheeler
0

Generally speaking, duplicates will always be possible when you can't check the database for existing values. All you can do is reduce the probability of a duplicate until it is low enough for your use case. This is the idea behind GUIDs.

If you really can't access the database and you really are limited to upper-case characters, then I would recommend generating a GUID-like value with the uniqid function, removing the characters you don't want, and converting it to uppercase. If you are worried that duplicates might still occur, concatenate two or more of these values to reduce the probability further.

Something like:

$unique_string = str_replace(".", "", strtoupper(uniqid(uniqid(uniqid(), true), true)));

lot