7

I have a system that requires a unique 6-digit code to represent an object, and I'm trying to think of a good algorithm for generating them. Here are the pre-reqs:

  • I'm using a base-20 system (no caps, numbers, vowels, or l to prevent confusion and naughty words)
    • The base-20 allows 64 million combinations
  • I'll be inserting potentially 5-10 thousand entries at once, so in theory I'd use bulk inserts, which means using a unique key probably won't be efficient or pretty (especially if there starts being lots of collisions)
  • It's not out of the question to fill up 10% of the combinations so there's a high potential for lots of collisions
  • I want to make sure the codes are non-consecutive

I had an idea that sounded like it would work, but I'm not good enough at math to figure out how to implement it: if I start at 0 and increment by N, then convert to base-20, it seems like there should be some value for N that lets me count each value from 0-63,999,999 before repeating any.

For example, going from 0 through 9 using N=3 (so 10 mod 3): 0, 3, 6, 9, 2, 5, 8, 1, 4, 7.

Is there some magic math method for figuring out values of N for some larger number that is able to count through the whole range without repeating? Ideally, the number I choose would sort of jump around the set such that it wasn't obvious that there was a pattern, but I'm not sure how possible that is.

Alternatively, a hashing algorithm that guaranteed uniqueness for values 0-64 million would work, but I'm way too dumb to know if that's possible.

Dan Breen
  • 12,626
  • 4
  • 38
  • 49
  • Prime numbers...I guess I really am bad at math; it seems obvious in hindsite. Thanks to everyone who answered. I gave credit to phantombrain for answering first. – Dan Breen Aug 10 '09 at 23:59
  • 1
    Since you mention tinyurl, my answer to this question might be applicable: http://stackoverflow.com/questions/1051949/map-incrementing-integer-range-to-six-digit-base-26-max-but-unpredictably/1052896#1052896 – FogleBird Aug 11 '09 at 03:07
  • Don't forget to double the codes to make them non-consecutive. – Beta Aug 11 '09 at 15:04
  • possible duplicate of [Creating your own Tinyurl style uid](http://stackoverflow.com/questions/190701/creating-your-own-tinyurl-style-uid) – user Jul 04 '13 at 05:34

4 Answers4

8

All you need is a number that shares no factors with your key space. Easiest value is to use a prime number. You can google for large primes, or use http://primes.utm.edu/lists/small/10000.txt

Cullen Walsh
  • 4,318
  • 1
  • 17
  • 12
1

Any prime number which is not a factor of the length of the sequence should be able to span the sequence without repeating. For 64000000, that means you shouldn't use 2 or 5. Of course, if you don't want them to be generated consecutively, generating them 2 or 5 apart is probably also not very good. I personally like the number 73973!

Nick Lewis
  • 4,150
  • 1
  • 21
  • 22
1

There is another method to get a similar result (jumping over the entire set of the values without repeating, nonconsequtively), without using the primes - by using maximum length sequences, which you can generate using specially constructed shift registers.

Andrew Y
  • 5,107
  • 2
  • 28
  • 29
0

My math is a bit rusty, but I think you just need to ensure that the GCF of N and 64 million is 1. I'd go with a prime number (that doesn't divide evenly into 64 million) just in case though.

Jonathan Rupp
  • 15,522
  • 5
  • 45
  • 61