
I need to generate a suffix to make a value unique. I thought of using the current date and time, but the suffix must be no more than 5 bytes long. Are there any hashing methods that can produce a hash of 5 bytes or less from a date in yyyyMMddHHmmss format?

Any other ideas? It would be simple to maintain a running counter and use the next value, but I would prefer not to rely on any kind of stored value.

Steve Crane
  • What's your range of dates? The obvious thing that springs to mind is [Unix time](http://en.wikipedia.org/wiki/Unix_time), of course, but that also depends on the encoding requirements for your hash. – Matt Gibson Aug 22 '11 at 09:39

1 Answer


If you do not need printable characters, I would suggest simply using the Unix timestamp. It fits in 4 bytes and, stored as a signed 32-bit value, works until January 19, 2038.
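As a rough illustration of the 4-byte option, the sketch below packs the current Unix time into a signed 32-bit big-endian value. The function name `timestamp_suffix` and the choice of big-endian byte order are assumptions made for this example, not part of the original answer.

```python
import struct
import time


def timestamp_suffix(now=None):
    """Return a Unix timestamp as 4 raw bytes (signed 32-bit, big-endian).

    `now` is an optional number of seconds since 1970-01-01 UTC; when it is
    omitted, the current time is used. Hypothetical helper for illustration.
    """
    seconds = int(time.time()) if now is None else int(now)
    # ">i" is a signed 32-bit integer, so values above 2**31 - 1
    # (2038-01-19 03:14:07 UTC) raise struct.error instead of wrapping.
    return struct.pack(">i", seconds)


if __name__ == "__main__":
    print(timestamp_suffix().hex())
```

The result is binary, so it is only suitable where the suffix does not have to be printable. Two values generated within the same second will be identical.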

If you want to use only a subset of characters, I would suggest creating a list of the values that you want to use.

  • Let's say you want to use the letters (upper and lower case) and the digits, which gives 62 values.
  • Now you need to convert the timestamp into base 62. Let's say your timestamp is 100:
  • 100 = (1 * 62^1) + (38 * 62^0)
  • If you have stored your printable values in an array, you can use the coefficients 1 and 38 as indices into that array.
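The conversion described in the list above can be sketched as follows. The ordering of the alphabet (digits, then upper case, then lower case) and the name `to_base62` are assumptions for this example; any fixed ordering of 62 distinct characters works the same way.

```python
import string

# Assumed ordering: '0'-'9' at indices 0-9, 'A'-'Z' at 10-35, 'a'-'z' at 36-61.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def to_base62(n):
    """Convert a non-negative integer to a base-62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        # The remainder is the next coefficient, least significant first.
        n, coefficient = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[coefficient])
    return "".join(reversed(chars))


if __name__ == "__main__":
    print(to_base62(100))  # coefficients 1 and 38 -> "1c" with this alphabet
```

With this alphabet, index 1 is `1` and index 38 is `c`, so the timestamp 100 from the example becomes `1c`.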

If you choose your base too small, five bytes will not be enough. In that case you can either subtract a constant from the timestamp (which will buy you some time), or you can estimate when duplicate timestamps will occur and check whether that date is past your retirement date ;-)
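To make the trade-off concrete: five base-62 characters can represent 62^5 = 916,132,832 distinct values, which is about 29 years of seconds. A raw Unix timestamp in 2011 is already around 1.3 billion, so it would need six characters, and subtracting a constant is what makes five sufficient. The sketch below assumes a hypothetical origin of 2011-08-22 UTC; `EPOCH`, `suffix` and `rollover` are names invented for this example.

```python
import string
from datetime import datetime, timedelta, timezone

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)      # 62
WIDTH = 5                 # maximum suffix length in characters
CAPACITY = BASE ** WIDTH  # 916,132,832 distinct values

# Hypothetical origin; pick any fixed moment not later than first use.
EPOCH = datetime(2011, 8, 22, tzinfo=timezone.utc)


def suffix(now):
    """Encode whole seconds elapsed since EPOCH as exactly WIDTH characters."""
    elapsed = int((now - EPOCH).total_seconds())
    if not 0 <= elapsed < CAPACITY:
        raise OverflowError("timestamp does not fit in the suffix")
    chars = []
    for _ in range(WIDTH):
        elapsed, coefficient = divmod(elapsed, BASE)
        chars.append(ALPHABET[coefficient])
    return "".join(reversed(chars))


def rollover():
    """Return the first moment that no longer fits in WIDTH characters."""
    return EPOCH + timedelta(seconds=CAPACITY)


if __name__ == "__main__":
    print(suffix(datetime.now(timezone.utc)), rollover().isoformat())
```

The suffix is zero-padded to a fixed width here so that all values have the same length; dropping the padding is equally valid if variable length is acceptable.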

bjoernz
  • Just wanted to add: If you do not need times before 1970 but want to avoid the [Y2K38 bug](http://en.wikipedia.org/wiki/Year_2038_problem), you can use 32-bit (4-byte) Unix timestamps for storage, provided the value is always treated as **unsigned**. That will postpone the problem until 2106. – snap Aug 22 '11 at 10:30
  • This is perfect. I found a nice class that does the conversion in [this answer](http://stackoverflow.com/questions/529647/need-a-smaller-alternative-to-guid-for-db-id-but-still-unique-and-random-for-url/529852#529852) and instead of using Unix time I used midnight on the day I wrote the code as the origin, extending the available range. – Steve Crane Aug 23 '11 at 10:14