34

I'm programming a script using PHP and MySQL and I want to get a unique id (consisting of a string: capitals and small letters with numbers) like: gHYtUUi5b. I found many functions in PHP that can generate such numbers but I'm afraid about how to ensure the id is unique!

UPDATE: uuid is long, I mean such id like: (P5Dc) an 11 alphanumeric char.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
assaqqaf
  • 1,575
  • 4
  • 21
  • 38

17 Answers17

63

EDIT: This answer has been flagged for being dangerous in the context of destroying a database. Do NOT use this code to generate unique ids in databases!

I use UUID() to create a unique value.

example:

insert into Companies (CompanyID, CompanyName) Values(UUID(), "TestUUID");
John
  • 1
  • 13
  • 98
  • 177
CSharpAtl
  • 7,374
  • 8
  • 39
  • 53
  • 2
    thanx but uuid is long, imean such id like: (P5Dc) a 11 alphanumeric char. – assaqqaf Sep 23 '09 at 18:00
  • It is long because it has to assure uniqueness :) generate it progragmatically if you really need to have it short, see my post – drAlberT Sep 23 '09 at 18:03
  • 13
    @assaqqaf, you can use UUID_SHORT() instead. – Vahid Jan 31 '13 at 07:47
  • 4
    Does generation of `UUID()` check if generated uuid already exists in table? In other words, is uuid always unique amongs the target table ? – Buksy May 11 '16 at 07:18
  • 3
    Could you please elaborate on why generating UUID by mysql could be "destroying databases"? – John Henry Dec 01 '22 at 13:07
23

You may like the way that we do it. I wanted a reversible unique code that looked "random" -a fairly common problem.

  • We take an input number such as 1,942.
  • Left pad it into a string: "0000001942"
  • Put the last two digits onto the front: "4200000019"
  • Convert that into a number: 4,200,000,019

We now have a number that varies wildly between calls and is guaranteed to be less than 10,000,000,000. Not a bad start.

  • Convert that number to a Base 34 string: "2oevc0b"
  • Replace any zeros with 'y' and any ones with 'z': "2oevcyb"
  • Upshift: "2OEVCYB"

The reason for choosing base 34 is so that we don't worry about 0/O and 1/l collisions. Now you have a short random-looking key that you can use to look up a LONG database identifier.

RJStanford
  • 619
  • 6
  • 5
  • I forgot - if you add 1,000,000,000 to the number after converting it back the first time, then you avoid the YYYYYYYY5 problem (mangling bases means that a large enough non-regular base number will look sufficiently random). Just don't forget to subtract it out on the way back :) – RJStanford Apr 21 '11 at 13:33
15

A programmatic way can be to:

  • add a UNIQUE INDEX to the field
  • generate a random string in PHP
  • loop in PHP ( while( ! DO_THE_INSERT ) )
    • generate another string

Note:

  • This can be dirty, but has the advantage to be DBMS-agnostic
  • Even if you choose to use a DBMS specific unique ID generator function (UUID, etc) it is a best practice to assure the field HAS to be UNIQUE, using the index
  • the loop is statistically not executed at all, it is entered only on insert failure
drAlberT
  • 22,059
  • 5
  • 34
  • 40
  • 1
    this the only solution in my mind, even before i post my question. is the way use resources. and at the same time isn't scientific. – assaqqaf Sep 23 '09 at 18:19
  • 1
    It is instead. If well implemented it is as safe as the DBMS based approach, resources are not a matter as if you generate a random string the probability of an INSERT fail is very very low (depending on the table dimension of course), so you loop is actually a single operation. – drAlberT Sep 23 '09 at 18:28
  • 2
    Unless of course your random number generator isn't completely random. Written race conditions are usually a Bad Idea ^TM – Billy ONeal Feb 15 '10 at 17:26
  • 9
    Rather than looping on failed insert (which can be caused by other reasons thus causing infinite loop), I'd loop the ID generation on finding the same ID, then insert and throw an exception on insert fail. You have the same statistic of only one loop, it's safer and provide information when the insert fails (for other reasons). – instanceof me Feb 15 '10 at 17:30
  • 1
    this way you are using 2 DB operations, being no more atomic. Better to check the reason of the failure inside the loop, once the failure is already occurred. – drAlberT Feb 16 '10 at 14:41
  • by adding a seed/salt which is for example the current date in seconds the chance of getting an duplicate string is very, very unlikely. – xorinzor Nov 28 '12 at 17:23
  • isnt that a first approach will be hitting database multiple time incase there is duplicate random number?is it good idea? – slier Oct 16 '14 at 12:48
10

If you use MySQL with version higher than 5.7.4, you can use the newly added RANDOM_BYTES function:

 SELECT TO_BASE64(RANDOM_BYTES(16));

This will result in a random string such as GgwEvafNLWQ3+ockEST00A==.

zah
  • 5,314
  • 1
  • 34
  • 31
9

How you generate the unique_ids is a useful question - but you seem to be making a counter productive assumption about when you generate them!

My point is that you do not need to generate these unique id's at the time of creating your rows, because they are essentially independent of the data being inserted.

What I do is pre-generate unique id's for future use, that way I can take my own sweet time and absolutely guarantee they are unique, and there's no processing to be done at the time of the insert.

For example I have an orders table with order_id in it. This id is generated on the fly when the user enters the order, incrementally 1,2,3 etc forever. The user does not need to see this internal id.

Then I have another table - unique_ids with (order_id, unique_id). I have a routine that runs every night which pre-loads this table with enough unique_id rows to more than cover the orders that might be inserted in the next 24 hours. (If I ever get 10000 orders in one day I'll have a problem - but that would be a good problem to have!)

This approach guarantees uniqueness and takes any processing load away from the insert transaction and into the batch routine, where it does not affect the user.

Rob Beer
  • 199
  • 2
  • 2
  • not wise really! generating a random string is so fast with server hardware that you've really gone over too much – azerafati Dec 30 '15 at 12:48
  • "What I do is pre-generate unique id's for future use, that way I can take my own sweet time and absolutely guarantee they are unique" How do you know the id doesn't already exist in the DB unless you check at insertion time? – Steven Byks Oct 12 '16 at 14:24
  • I really like this approach. Thanks for the suggestion – jdferreira Aug 19 '18 at 22:04
  • If one were to use this, he could use a method that creates the key(s) in the key table and returns it/them back to the function that performs the insert. This would eliminate the cap on nightly key generation. Wrap these into a transaction to avoid accumulating unused keys. – DanimalReks Feb 05 '21 at 12:06
  • To respond to Steven Byks ... you *can* guarantee uniqueness because when generating new id's you know all the ids already in the orders table and all the ids you have already prepared in your staging / top-up area, or whatever you want to call it. Part of the rationale for doing it this way is that these random ids are not really random - you have to check that they are not the same as any you already have, and this can take a bit of extra time. No need to check at insertion time - you've already guaranteed uniqueness - but *do* put a unique key on that database column just in case of bugs! – Rob Beer Aug 12 '21 at 07:55
3

Use UUID function.

I don't know the source of your procedures in PHP that generates unique values. If it is library function they should guarantee that your value is really unique. Check in documentation. You should, hovewer, use this function all the time. If you, for example, use PHP function to generate unique value, and then you decide to use MySQL function, you can generate value that already exist. In this case putting UNIQUE INDEX on the column is also a good idea.

Lukasz Lysik
  • 10,462
  • 3
  • 51
  • 72
3
DELIMITER $$

USE `temp` $$

DROP PROCEDURE IF EXISTS `GenerateUniqueValue`$$

CREATE PROCEDURE `GenerateUniqueValue`(IN tableName VARCHAR(255),IN columnName VARCHAR(255)) 
BEGIN
    DECLARE uniqueValue VARCHAR(8) DEFAULT "";
    DECLARE newUniqueValue VARCHAR(8) DEFAULT "";
    WHILE LENGTH(uniqueValue) = 0 DO
        SELECT CONCAT(SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789', RAND()*34+1, 1),
                SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789', RAND()*34+1, 1),
                SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789', RAND()*34+1, 1),
                SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789', RAND()*34+1, 1),
                SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789', RAND()*34+1, 1),
                SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789', RAND()*34+1, 1),
                SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789', RAND()*34+1, 1),
                SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789', RAND()*34+1, 1)
                ) INTO @newUniqueValue;
        SET @rcount = -1;
        SET @query=CONCAT('SELECT COUNT(*) INTO @rcount FROM  ',tableName,' WHERE ',columnName,'  like ''',newUniqueValue,'''');
        PREPARE stmt FROM  @query;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
    IF @rcount = 0 THEN
            SET uniqueValue = @newUniqueValue ;
        END IF ;
    END WHILE ;
    SELECT uniqueValue;
    END$$

DELIMITER ;

And call the stored procedure as GenerateUniqueValue('tableName','columnName'). This will give you a 8 digit unique character everytime.

2

To get unique and random looking tokens you could just encrypt your primary key i.e.:

SELECT HEX(AES_ENCRYPT(your_pk,'your_password')) AS 'token' FROM your_table;

This is good enough plus its reversable so you'd not have to store that token in your table but to generate it instead.

Another advantage is once you decode your PK from that token you do not have to do heavy full text searches over your table but simple and quick PK search.

Theres one small problem though. MySql supports different block encryption modes which if changed will completely change your token space making old tokens useless...

To overcome this one could set that variable before token generated i.e.:

SET block_encryption_mode = 'aes-256-cbc';

However that a bit waste... The solution for this is to attach an encryption mode used marker to the token:

SELECT CONCAT(CONV(CRC32(@@GLOBAL.block_encryption_mode),10,35),'Z',HEX(AES_ENCRYPT(your_pk,'your_password'))) AS 'token' FROM your_table;

Another problem may come up if you wish to persist that token in your table on INSERT because to generate it you need to know primary_key for the record which was not inserted yet... Ofcourse you might just INSERT and then UPDATE with LAST_INSERT_ID() but again - theres a better solution:

INSERT INTO your_table ( token )
SELECT CONCAT(CONV(CRC32(@@GLOBAL.block_encryption_mode),10,35),'Z',HEX(AES_ENCRYPT(your_pk,'your_password'))) AS 'token'
FROM information_schema.TABLES 
WHERE  TABLE_SCHEMA = DATABASE() AND TABLE_NAME = "your_table";

One last but not least advantage of this solution is you can easily replicate it in php, python, js or any other language you might use.

Community
  • 1
  • 1
1

For uniqueness what I do is I take the Unix timestamp and append a random string to it and use that.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Sabeen Malik
  • 10,816
  • 4
  • 33
  • 50
  • u can use microtime and convert it to hex to get the string effect or atleast in most cases , this would work ok if u dont have high volume traffic / insertions in db. – Sabeen Malik Sep 23 '09 at 18:17
  • "i use it to access recourse through url, if i use auto_increment it will be easy to suggest" .. keeping that in mind... if insertions are high .. you can do it the other way around .. use auto increment to add normal ids in the DB .. on the site , where you display the links , have the id encrypted with a known key and then decrypt it back with that key before making the DB query. – Sabeen Malik Sep 23 '09 at 19:20
  • 1
    if u are already using this .. i would suggest keep it this way .. i have no idea how much traffic u have or the db size .. but keep in mind that varchar type look ups are slower than integer look ups and re-indexing on inserts/updates is costly as well. ideally to avoid calculating the hash each time .. just create another field in the db which stores this hash .. show that hash to the user from there..atleast that would avoid the encryption part each time the link needs to be displayed. i personally prefer putting more load on the php end rather than mysql. – Sabeen Malik Sep 23 '09 at 22:21
  • i agree with you 80%, but i think there are other solution used by large site like: youtube to keep id simple and short.. – assaqqaf Sep 23 '09 at 23:24
  • youtube stores its data on CDNs , no one really know how they generate the unique ID .. but my guess is that unique ID contains not only the record ID but the server/cluster/CDN addressing as well .. so that unique ID is basically a composite of more than 1 values, which is decrypted into a more meaningful and fuller explanation of the record and its location. – Sabeen Malik Sep 23 '09 at 23:31
1

Below is just for reference of numeric unique random id...

it may help you...

$query=mysql_query("select * from collectors_repair");
$row=mysql_num_rows($query);
$ind=0;
if($row>0)
{
while($rowids=mysql_fetch_array($query))
{
  $already_exists[$ind]=$rowids['collector_repair_reportid'];
}
}
else
{
  $already_exists[0]="nothing";
}
    $break='false';
    while($break=='false'){
      $rand=mt_rand(10000,999999);

      if(array_search($rand,$alredy_exists)===false){
          $break='stop';
      }else{

      }
    }

 echo "random number is : ".$echo;

and you can add char with the code like -> $rand=mt_rand(10000,999999) .$randomchar; // assume $radomchar contains char;

warren
  • 32,620
  • 21
  • 85
  • 124
Aravinth
  • 11
  • 3
1

crypt() as suggested and store salt in some configuration file, Start salt from 1 and if you find duplicate move to next value 2. You can use 2 chars, but that will give you enough combination for salt.

You can generate string from openssl_random_pseudo_bytes(8). So this should give random and short string (11 char) when run with crypt().

Remove salt from result and there will be only 11 chars that should be enough random for 100+ millions if you change salt on every fail of random.

vard
  • 4,057
  • 2
  • 26
  • 46
gorrc
  • 11
  • 1
0

You might also consider using crypt()* to generate a [nearly-guaranteed] unique ID inside your contraints.

warren
  • 32,620
  • 21
  • 85
  • 124
0
 <?php
    $hostname_conn = "localhost";
    $database_conn = "user_id";
    $username_conn = "root";
    $password_conn = "";
     $conn = mysql_pconnect($hostname_conn, $username_conn,   $password_conn) or trigger_error(mysql_error(),E_USER_ERROR); 
   mysql_select_db($database_conn,$conn);
   // run an endless loop      
    while(1) {       
    $randomNumber = rand(1, 999999);// generate unique random number               
    $query = "SELECT * FROM tbl_rand WHERE the_number='".mysql_real_escape_string ($randomNumber)."'";  // check if it exists in database   
    $res =mysql_query($query,$conn);       
    $rowCount = mysql_num_rows($res);
     // if not found in the db (it is unique), then insert the unique number into data_base and break out of the loop
    if($rowCount < 1) {
    $con = mysql_connect ("localhost","root");      
    mysql_select_db("user_id", $con);       
    $sql = "insert into tbl_rand(the_number) values('".$randomNumber."')";      
    mysql_query ($sql,$con);        
    mysql_close ($con);
    break;
    }   
}
  echo "inserted unique number into Data_base. use it as ID";
   ?>
0

USE IT

    $info = random_bytes(16);
    $info[6] = chr(ord($info[6]) & 0x0f | 0x40); 
    $info[8] = chr(ord($info[8]) & 0x3f | 0x80); 
    $result =vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($info), 4));
    return $result;
Devang Hire
  • 174
  • 1
  • 5
-1

This generates random ids:

CREATE TABLE Persons (
    ID Integer PRIMARY KEY AUTOINCREMENT,
    LastName varchar(255) NOT NULL,
    FirstName varchar(255),
    Age int
);
Rob
  • 26,989
  • 16
  • 82
  • 98
-1

You could use Twitter's snowflake.

In short, it generates a unique id based on time, server id and a sequence. It generates a 64-bit value so it is pretty small and it fits in an INT64. It also allows for sorting values correctly.

https://developer.twitter.com/en/docs/basics/twitter-ids

In sum, it allows multiple servers, highly concurrency, sorting value and all of them in 64 bits.

Here it is the implementation for MySQL

https://github.com/EFTEC/snowflake-mysql

It consists of a function and a table.

magallanes
  • 6,583
  • 4
  • 54
  • 55
-1

After searching, I found this article.

Plan A: UUID or GUID.

Plan B: RAND:

SELECT FLOOR(rand() * 90000 + 10000);

Plan C: Full MD5:

SELECT MD5(NOW());

Plan D: Part of MD5:

SELECT LEFT(MD5(NOW()), 4);

Plan E: Encrypt/Decrypt

SELECT HEX(AES_ENCRYPT(123, 'foo')); 
SELECT AES_DECRYPT(UNHEX('C7FC2E77A3D894D2BA02F2214E37F107'), 'foo');
Hi computer
  • 946
  • 4
  • 8
  • 19