11

As much as I like using GUIDs as the unique identifiers in my system, it is not very user-friendly for fields like an order number where a customer may have to repeat that to a customer service representative.

What's a good algorithm to use to generate order number so that it is:

  • Unique
  • Not sequential (purely for optics)
  • Numeric values only (so it can be easily read to a CSR over phone or keyed in)
  • < 10 digits
  • Can be generated in the middle tier without doing a round trip to the database.

UPDATE (12/05/2009) After carefully reviewing each of the answers posted, we decided to randomize a 9-digit number in the middle tier to be saved in the DB. In the case of a collision, we'll regenerate a new number.

Jason
  • 4,232
  • 3
  • 23
  • 31
  • Related: http://stackoverflow.com/questions/721497/database-wide-unique-yet-simple-identifiers-in-sql-server – John Rasch Dec 01 '09 at 04:01
  • 7
    Naive question: why not sequential? What does it buy you? – Mathias Dec 01 '09 at 04:06
  • 9
    @Mathias - I do not want our competitors to know how many orders I've completed in a given time span. – Jason Dec 01 '09 at 04:59
  • @John - The accepted answer in the question you referenced is exactly what I don't want to do which is to rely on the database as a tie breaker. – Jason Dec 01 '09 at 05:00
  • @Jason - why won't you do what governments do with citizens IDSs: Increase the global counter with a random amount from time to time. – Elazar Leibovich Dec 04 '09 at 12:35

10 Answers10

10

If the middle tier cannot check what "order numbers" already exists in the database, the best it can do will be the equivalent of generating a random number. However, if you generate a random number that's constrained to be less than 1 billion, you should start worrying about accidental collisions at around sqrt(1 billion), i.e., after a few tens of thousand entries generated this way, the risk of collisions is material. What if the order number is sequential but in a disguised way, i.e. the next multiple of some large prime number modulo 1 billion -- would that meet your requirements?

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
  • 1
    +1 This is a linear congruential random number generator. Note that the sequence can be deduced from a few numbers, it is only obfuscated. – starblue Dec 01 '09 at 09:24
  • 1
    You can't really deduce a sequence if there is a real possibility that you're missing some numbers. E.g. "1 ? 2 ? 4" could very well be sequential. – MSalters Dec 02 '09 at 10:12
6

<Moan>OK sounds like a classic case of premature optimisation. You imagine a performance problem (Oh my god I have to access the - horror - database to get an order number! My that might be slow) and end up with a convoluted mess of psuedo random generators and a ton of duplicate handling code.</moan>

One simple practical answer is to run a sequence per customer. The real order number being a composite of customer number and order number. You can easily retrieve the last sequence used when retriving other stuff about your customer.

James Anderson
  • 27,109
  • 7
  • 50
  • 78
  • +1. True. We use the sequence per customer approach. It works flawless. You have less code to worry. :) – Guru Dec 01 '09 at 06:50
  • @James: It's not that we are scared of accessing the - gasp - database. It's because you can have a race condition between 2 orders being assigned the same order number if you go to the database and return to the middle tier to find out the next logical order number. – Jason Dec 01 '09 at 16:46
  • Table with 1 row 1 column. 1 command increments and returns the result. Everything goes through the command. No worries about a race. – s_hewitt Dec 01 '09 at 19:05
  • 3
    Most DBMSes suport a "sequence" feature, and most DBMSes also support sub-transactions or spearate units of work. Just get your next sequence outside the normal UOW -- you will end up with gaps in the sequence becuase of cancelled UOWs aborted oreders etc. but that doesnt really matter. You will certainly not have race conditions. – James Anderson Dec 02 '09 at 01:19
  • @James Anderson I agree, I disagree. If the customer does not yet exist (new signup), then the entire user creation/order-storage/creditcard-transaction process has to be wrapped in a DB transaction block (since you must create user to get the userid, and get next-id from orders table). The benefit of generating a GUID or equivalent is that, up front, you have a unique ID to pass to the payment processor. If the payment succeeds, then, at that point, you persist user/order to DB in a rollback-able transaction. Much cleaner implementation, IMO. Tradeoffs either way, but I prefer GUIDs – virtualeyes Jun 01 '12 at 08:21
5

One simple option is to use the date and time, eg. 0912012359, and if two orders are received in the same minute, simply increment the second order by a minute (it doesn't matter if the time is out, it's just an order number).

If you don't want the date to be visible, then calculate it as the number of minutes since a fixed point in time, eg. when you started taking orders or some other arbitary date. Again, with the duplicate check/increment.

Your competitors will glean nothing from this, and it's easy to implement.

Will
  • 1,561
  • 9
  • 6
  • To increase the granularity you can add seconds and milliseconds. But you still will run into race conditions if you don't have a single synchronized order number factory. – Andreas Kraft Dec 07 '09 at 01:49
2

Use primitive polynomials as finite field generator.

Luka Rahne
  • 10,336
  • 3
  • 34
  • 56
1

Maybe you could try generating some unique text using a markov chain - see here for an example implementation in Python. Maybe use sequential numbers (rather than random ones) to generate the chain, so that (hopefully) the each order number is unique.

Just a warning, though - see here for what can possibly happen if you aren't careful with your settings.

a_m0d
  • 12,034
  • 15
  • 57
  • 79
1

One solution would be to take the hash of some field of the order. This will not guarantee that it is unique from the order numbers of all of the other orders, but the likelihood of a collision is very low. I would imagine that without "doing a round trip to the database" it would be challenging to make sure that the order number is unique.

In case you are not familiar with hash functions, the wikipedia page is pretty good.

momeara
  • 1,341
  • 2
  • 17
  • 29
  • "Very low" is optimistic: with just 1 billion possible results, the chance of at least one collision for 60,000 orders is over 16%. Cfr the **Cast as a collision** paragraph in http://en.wikipedia.org/wiki/Birthday_paradox : probability of no collisions in n orders (with the best possible hash: one as good as perfect randomness!) is about `exp(-n*(n-1)/(2.0*10e9))`. – Alex Martelli Dec 01 '09 at 16:10
1

You could base64-encode a guid. This will meet all your criteria except the "numeric values only" requirement.

Really, though, the correct thing to do here is let the database generate the order number. That may mean creating an order template record that doesn't actually have an order number until the user saves it, or it might be adding the ability to create empty (but perhaps uncommitted) orders.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
  • 1
    In case anyone is interested, here an article on how to base64-encode a GUID: http://www.codinghorror.com/blog/archives/000409.html – Jason Dec 01 '09 at 05:03
1

Your 10 digit requirement is a huge limitation. Consider a two stage approach.

  1. Use a GUID
  2. Prefix the GUID with a 10 digit (or 5 or 4 digit) hash of the GUID.

You will have multiple hits on the hash value. But not that many. The customer service people will very easily be able to figure out which order is in question based on additional information from the customer.

John
  • 5,735
  • 3
  • 46
  • 62
0

The straightforward answer to most of your bullet points:

Make the first six digits a sequentially-increasing field, and append three digits of hash to the end. Or seven and two, or eight and one, depending on how many orders you envision having to support.

However, you'll still have to call a function on the back-end to reserve a new order number; otherwise, it's impossible to guarantee a non-collision, since there are so few digits.

gary
  • 511
  • 6
  • 11
0

We do TTT-CCCCCC-1A-N1.

  • T = Circuit type (D1E=DS1 EEL, D1U=DS1 UNE, etc.)
  • C = 6 Digit Customer ID
  • 1 = The customer's first location
  • A = The first circuit (A=1, B=2, etc) at this location
  • N = Order type (N=New, X=Disconnect, etc)
  • 1 = The first order of this kind for this circuit
Josh
  • 68,005
  • 14
  • 144
  • 156
  • Although, what happens on `...-X10`? ^^ –  Jan 04 '11 at 05:57
  • Well the struct throws an ArgumentException but in reality, we'd probably give up after 9 failed attempts at disconnecting a circuit and just take scissors to it. – Josh Jan 04 '11 at 06:42