1

Currently I run a web app that heavily relies on short URLs that link to the original content, these URLs need to be as short as possible.

At the moment, I use a simple MySQL data store for storing these objects and an incrementing counter -> base 62 conversion to generate short urls that will grow over time. We need to move to a more distributed and scalable environment.

What is the best way of generating small, unique short Urls for content in a distributed data store? To be specific we plan to use either Mongo or DynamoDB.

Charles
  • 50,943
  • 13
  • 104
  • 142
tarnfeld
  • 25,992
  • 41
  • 111
  • 146

1 Answers1

4

I'd suggest you have a look at Jon Skeet's description of the HiLo algorithm here: What's the Hi/Lo algorithm?

For the specific use case of mongo, see http://dllhell.net/2010/07/23/on-sequences-with-mongodb-and-norm/

Community
  • 1
  • 1
Chris Shain
  • 50,833
  • 6
  • 93
  • 125
  • I started working on a network service that would run on each node in the distributed app (the app servers) and the code would request an ID from the service, the service would allocate a range, say, 4501-5000 and use those, then go get some more, from some atomic counter store... though I wasn't sure if this was the best approach. – tarnfeld Mar 30 '12 at 22:39
  • Oh cool! For reference - https://github.com/tarnfeld/snowey :-) Currently I have a redis implementation (for the atomic range store) but i'm working on the dynamodb one. – tarnfeld Mar 30 '12 at 23:06
  • Perhaps a larger scale than you'll need, but an interesting project is Twitter's Snowflake: https://github.com/twitter/snowflake#readme – Joshua Martell Mar 31 '12 at 03:01
  • I spent a lot of time looking into Snowflake and Boundary/flake but the problem with both of those is that they aren't incremental, increasing from small numbers, and turning a snowflake ID into a base 62 string makes it about 18 characters long... which for me is too long. – tarnfeld Apr 01 '12 at 08:44