0

I have a users table and I need to create a unique, random looking alphanumeric "id" for each user (they already have autoincrementing ids as usually). This identifier must:

  • Be unique
  • Be random looking
  • Match the pattern AAAA-1234 (4 letters, 4 numbers)

Is there a better way than to keep randomly generating strings until I find one that is not in the database yet?

nxu
  • 2,202
  • 1
  • 22
  • 34
  • when you say 'random looking' do you mean that the numbers shouldn't be consequential? – Pavel Petrov Jun 14 '16 at 06:12
  • @PavelPetrov exactly – nxu Jun 14 '16 at 06:13
  • May be the algorithm you need is similar to that that generates credit card numbers. Probably you'll get better answer if you ask this question in Cryptography SE. – Pavel Petrov Jun 14 '16 at 06:27
  • @PavelPetrov I don't think it belongs there, it is an algorithmical problem rather than cryptographicaly – nxu Jun 14 '16 at 06:31
  • So performance is not exactly of concern? – Drew Jun 14 '16 at 10:55
  • @Drew It's not the most important aspect but it should be reasonable – nxu Jun 14 '16 at 11:02
  • 2
    I guess the question is, how much time and effort are you willing to spend to make it look cute amidst diverging from normal risk-adverse recommendations (ie: what you are currently doing that is just fine enough). What works does get boring afterall, and we just create silly work for ourselves, now don't we. – Drew Jun 14 '16 at 11:14
  • +1 what Drew said: more information is needed about what you want and what you don't want. Are possible collisions (with a super-small probability) okay? Is the performance with random-hashing and checking for duplicates not fast enough? Is it a one-time process or are you doing this very often (same data; different data)? (the resulting identifier looks really small; is your data-set small too?) You could build some approach with perfect-hashing (which maps injectively to some unique numbers which could index a-priori created random-identifiers); but it's only usable in *some* scenarios. – sascha Jun 14 '16 at 11:16
  • Right, 4.5B possibilities for saturation. A somewhat related performance [read](http://stackoverflow.com/a/34015333) of mine for the chronically bored. – Drew Jun 14 '16 at 11:25
  • Do you need there to be redundancy, such that some letter-number combinations are immediately detected as being invalid? – sh1 Jun 15 '16 at 04:13

1 Answers1

2

Assign each user an integer in boring old sequential order (or use that other id you mentioned). Call it $x.

Set $x = (($x + 2135587861) * 2654435769) & 0xffffffff.

Set $x = $x ^ ($x >> 15).

Set $x = (($x + 2135587861) * 2654435769) & 0xffffffff again.

Calculate $x % 26 and choose a letter a-z based on the result. Set $x = $x / 26. Repeat four times (I don't know PHP, so you get verbal instructions here).

Calculate $x % 10 and choose a digit 0-9 based on the result. Set $x = $x / 10. Repeat four times.

First six results I get are:

HSQG-2102
DNQO-1176
TEKJ-5435
EHWX-6540
UPPH-0450
MVIX-5036

It's not exactly perfect, but it's non-obvious. Perhaps that's sufficient.

Also, it only works for the first 4 billion(ish) users before you get collisions, but that's only a little way off of the limit of the string format anyway.

sh1
  • 4,324
  • 17
  • 30
  • I don't really understand the numbers here, but if I see correctly this basically means that each $x will map to exactly one string, which will be unique until $x < 16^8, right? – nxu Jun 15 '16 at 06:25
  • 1
    Yes, they're unique up to that limit. All the magic numbers are arbitrary and I just picked some that seemed to work. I used the first 64 bits of the golden ratio. The only restriction is that the multiplier must be odd -- this has the mathematical property that every possible input maps to a unique output even after `& 0xffffffff`. Since all of the operations have that same property, we know that we're not folding any two inputs into the same output (collision), and that we can theoretically reverse the operation to discover the original number. – sh1 Jun 15 '16 at 17:35
  • Thank you for the explanation! – nxu Jun 16 '16 at 05:38