6

I'm trying to create a short, 5- or 6-character invitation code that will be generated upon the creation of a "group" on my site. The group info is stored in the group table. Users wishing to join a group must have an invitation code; it isn't necessary that they know anything else.

Obviously, I need the invitation code to be unique, and I am hoping to generate unique strings without a double check, but figuring out the code has been difficult. I've been reading dozens of SO questions and answers. This is what I've come up with:

When inserting the group info, such as name, into the group table, the row is given a unique, auto-incrementing id, naturally.

1) Grab that id

2) add it to 1234

3) simply put the values next to each other after converting the team name from base36 to base10, e.g. "NewYorkYankees" is base10(3994055806027582482648) [1263399405580602758248264820130221060827]

4) convert to base 36

5) INSERT the code into the database

This is guaranteed to be unique for every group, right? Zero chance of collision? I say this because it isn't at all random; I start with something unique, and I keep it unique, never introducing random data.

I have a couple of issues, though. Since group names are repeatable, how do I grab the row id upon creation/INSERTion? This won't work, but it's where I'm at:

$query = "SELECT id FROM groups WHERE gname = :gname";
...
$uid = $result + '1234';
$hex = md5(":gname NOW()" . uniqid("$uid", true));
base_convert($hex, 10, 36);
intval($str, 36);

$query = "INSERT...";

The result should be unique and short, but unpredictable without all the right pieces, which aren't available to users.

David

3 Answers

2

How to get the last inserted auto_increment value is described here.

Once you have the ID, encrypt it with a symmetric-key cipher such as AES, using a secret key. The result is guaranteed to be unique, because it can be transformed back to the original plaintext (which is called decryption).

You can tune the block size to get the desired length (in bits). With base64 the length will be a multiple of 4 (8-bit characters are encoded in triplets, and each triplet becomes a 4-character block in base64).
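To make the length arithmetic concrete, here is a minimal PHP sketch. It skips the encryption step entirely and only shows how many printable characters base64 yields for a given number of input bytes once the trailing "=" padding is stripped. The helper name `unpaddedBase64Length` is hypothetical, and the 16-byte case is included because a single AES block is 128 bits.

```php
<?php
// Hypothetical helper: number of base64 characters produced for $bytes
// bytes of input after the trailing "=" padding is removed.
function unpaddedBase64Length(int $bytes): int
{
    // The byte values do not affect the encoded length, so zeros suffice.
    $encoded = base64_encode(str_repeat("\0", $bytes));
    return strlen(rtrim($encoded, '='));
}

foreach ([3, 4, 5, 16] as $bytes) {
    printf("%2d bytes (%3d bits) -> %d characters\n",
        $bytes, $bytes * 8, unpaddedBase64Length($bytes));
}
```

Under these assumptions, the code length is driven by how many bytes the cipher outputs, not by how small the ID is.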

vbence
  • I found that shortly before you posted it. Thank you. You gave a decent solution, but multiples of four, even at just 8 characters, are too long for my purpose. – David Feb 21 '13 at 13:25
  • @David Maybe only 4 printable characters (24 bits) are enough. That will cover your IDs from 1 to 16 million. When you reach that much activity on your site you will certainly want to rebuild it anyway. :) Jokes aside, you can get 6 printable characters if you drop the padding equal signs from the end of your base64 string (that will give you 32 bits, which covers ids from 1 to about 4.29 billion). – vbence Feb 21 '13 at 13:34
  • I understand the low probability, but I feel like 4 characters makes guessing a valid value too easy. Your answer is a good one, and fine for many applications, but it doesn't offer me precisely what I'm asking for. I don't particularly like marking my own answers as accepted. So, even though what I found is more suited to my request, I may still accept your answer. – David Feb 21 '13 at 23:21
  • @David Self-accepted answers are totally all right. If the `Hashids` library does exactly what you need that's the most helpful answer for future visitors of the question. – vbence Feb 22 '13 at 07:56
2
    $query = "INSERT INTO groups (gname, gadmin) VALUES (:gname, :gadmin)";
    $query_params = array( ':gname' => $trimmed['gname'],
                           ':gadmin' => $userid );

    try {
        $stmt = $db->prepare($query);
        $result = $stmt->execute($query_params);
        $gid = $db->lastInsertId();
    }
    catch(PDOException $ex) {
        die("Failed to run query: " . $ex->getMessage());
    }

    // Use this library: https://github.com/ivanakimov/hashids.php
    $hashids = new Hashids\Hashids('this is my constant salt', 5,
                'abcdefghijklmnpqrstuvwxyz0123456789');

    // Note: releases of the library from 1.0 on name this method encode().
    $hash = $hashids->encrypt($gid);

    $query = "UPDATE groups SET invite = '$hash' WHERE id = '$gid'";
    ...

The library in question handles the heavy lifting. It doesn't actually hash, per se. It reversibly encodes the input (the library calls this encrypting); since my row ids are unique, so is the encoded result. I have no need of decoding the "hashes," but the option exists. I can't strictly define the length, but I can set a minimum and have room to grow. Also, as you can see, it allows me to define an 'alphabet' as well.
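To illustrate why a reversible mapping of unique ids cannot collide, here is a minimal sketch of a different, much simpler technique: multiplying the id by a secret constant modulo 36^6. This is not the Hashids algorithm. The names `CODE_SPACE`, `MULTIPLIER`, `modInverse`, `encodeId` and `decodeCode` are all hypothetical, and the sketch assumes 64-bit PHP integers and ids below 36^6.

```php
<?php
// Sketch only: NOT the Hashids algorithm. Maps an id in [0, 36^6) to a
// unique 6-character base-36 code via multiplication modulo 36^6.
const CODE_SPACE = 2176782336; // 36^6
const MULTIPLIER = 1580030173; // hypothetical secret; must be coprime to 36

// Modular inverse of $a modulo $m via the extended Euclidean algorithm.
function modInverse(int $a, int $m): int
{
    [$oldR, $r] = [$a, $m];
    [$oldS, $s] = [1, 0];
    while ($r !== 0) {
        $q = intdiv($oldR, $r);
        [$oldR, $r] = [$r, $oldR - $q * $r];
        [$oldS, $s] = [$s, $oldS - $q * $s];
    }
    if ($oldR !== 1) {
        throw new InvalidArgumentException('multiplier is not coprime to modulus');
    }
    return (($oldS % $m) + $m) % $m;
}

function encodeId(int $id): string
{
    if ($id < 0 || $id >= CODE_SPACE) {
        throw new RangeException('id out of range');
    }
    // Multiplying by a unit modulo CODE_SPACE is a bijection, so distinct
    // ids always produce distinct codes.
    $scrambled = ($id * MULTIPLIER) % CODE_SPACE;
    return str_pad(base_convert((string) $scrambled, 10, 36), 6, '0', STR_PAD_LEFT);
}

function decodeCode(string $code): int
{
    $scrambled = (int) base_convert($code, 36, 10);
    return ($scrambled * modInverse(MULTIPLIER, CODE_SPACE)) % CODE_SPACE;
}

foreach ([1, 2, 3] as $id) {
    $code = encodeId($id);
    printf("%d -> %s -> %d\n", $id, $code, decodeCode($code));
}
```

Because consecutive ids differ by a fixed step in the scrambled space, someone who collects a few codes could work out the pattern, so treat this as obfuscation rather than security.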

David
0

Why not concatenate the IP address or user-id of the person creating the group, and the time including milliseconds, and then MD5 or SHA-256 that to generate the string you publish? That will certainly be both unpredictable and nonrepeating.

D Mac
  • It will be entirely too long and has possible collisions. – David Feb 21 '13 at 12:48
  • Possible collisions? How? It is practically impossible for two requests to arrive in the same millisecond from the same IP address, and collisions with MD5 are difficult to create and unlikely. SHA collisions haven't been discovered yet (and are mathematically extremely unlikely). – D Mac Feb 21 '13 at 13:02
  • First of all, absolutely too long of a result. Second, if those long results are somehow shortened, then the probability of collisions increases, especially when approaching my desired length. – David Feb 21 '13 at 13:27
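The point about shortened hashes can be estimated with the usual birthday approximation. The sketch below assumes a truncated hash behaves like a uniformly random 6-character code over a 36-symbol alphabet, which is an idealization; the function name `collisionProbability` is hypothetical.

```php
<?php
// Approximate probability that at least two of $n random codes collide
// when drawn uniformly from $space possible values (birthday bound):
// p ~= 1 - exp(-n(n-1) / (2 * space)).
function collisionProbability(int $n, float $space): float
{
    return 1.0 - exp(-$n * ($n - 1) / (2.0 * $space));
}

$space = 36 ** 6; // 6 characters from a 36-symbol alphabet

foreach ([1000, 10000, 100000] as $groups) {
    printf("%6d groups -> %.4f\n", $groups, collisionProbability($groups, $space));
}
```

Under this assumption a collision becomes more likely than not well before the code space fills up, which is why a random or truncated-hash code needs a uniqueness check while a reversible mapping of the row id does not.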