Each user object in my database has an incremental ID (1, 2, 3, ...). The URL to view a user's profile contains the ID of the user object; e.g. http://www.example.com/users/1. This way everyone can see how many users there are on the website, how fast the userbase is growing etc. I don't want to give that information away.

I would like to convert the incremental ID to a fixed-length string in Base58 format, so the URL would look like http://www.example.com/users/2WNrx2jq184. I also need the reverse function that converts the string back to the original ID. The mapping should not be easy to reverse engineer.

The best Python code I found for this purpose is https://github.com/JordanReiter/django-id-obfuscator. It is very good, but in some cases it adds a `0` and/or `.` character, which leads to strings that are neither valid Base58 nor of fixed length. (See utils.py lines 24 and 29.)

How can I improve django-id-obfuscator so that it produces fixed-length Base58 obfuscated IDs, or how else can I create such obfuscated IDs in Python?

Korneel
  • I presume you want to avoid creating a random number and storing its reference to the actual ID somewhere in the database? – vgru Jun 15 '12 at 09:42
  • https://github.com/JordanReiter/django-id-obfuscator/blob/master/id_obfuscator/base58.py - this does not appear to contain `0` or `.`. – eumiro Jun 15 '12 at 09:42
  • @eumiro https://github.com/JordanReiter/django-id-obfuscator/blob/master/id_obfuscator/utils.py - it happens here – Korneel Jun 15 '12 at 09:43
  • @Groo Yes, I want to avoid that. It's a Google App Engine application. Retrieving an object by ID costs less than retrieving it with a query. – Korneel Jun 15 '12 at 09:45
  • related: http://stackoverflow.com/questions/8554286/obfuscating-an-id – mensi Jun 15 '12 at 11:57
  • If Hasty Pudding is not suitable, then it is easy enough to write your own Feistel cypher with any even-numbered bit size. With only four rounds it won't be secure, but it will be enough to obfuscate. – rossum Jun 16 '12 at 11:22

2 Answers

If you want to do this properly, take your user ID, pad it with leading zeros to the cipher's block size, encrypt it with something like AES, and encode the result with base58. To get the ID back, decode, decrypt, and call int() on the result.

So for encryption (base64 stands in for base58 here):

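>>> from Crypto.Cipher import AES  # PyCryptodome, or the legacy PyCrypto
>>> import base64
>>> # 16-byte key; a padded ID is exactly one block, so ECB mode is used
>>> obj = AES.new(b'yoursecretkeyABC', AES.MODE_ECB)
>>> x = base64.b64encode(obj.encrypt(b"%016d" % 1))
>>> x
b'tXDxMg1YGb1i0V29yCCBWg=='

and decryption

>>> int(obj.decrypt(base64.b64decode(x)))
1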
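
The snippet above uses base64, while the question asks for fixed-length Base58. As a rough sketch (not taken from django-id-obfuscator or any base58 package), a 16-byte AES block can be turned into exactly 22 Base58 characters by always emitting 22 digits, since 58**22 > 2**128. The names `b58encode_block`, `b58decode_block` and `WIDTH` are made up for this illustration, and the alphabet is assumed to be the Bitcoin-style one.

```python
# Bitcoin-style Base58 alphabet (assumed): no 0, O, I or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
WIDTH = 22  # 58**22 > 2**128, so 22 digits cover any 16-byte block


def b58encode_block(block: bytes) -> str:
    """Encode a 16-byte block as exactly WIDTH Base58 characters."""
    if len(block) != 16:
        raise ValueError("expected a 16-byte block")
    n = int.from_bytes(block, "big")
    digits = []
    for _ in range(WIDTH):  # always emit WIDTH digits, padding with '1'
        n, rem = divmod(n, 58)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def b58decode_block(text: str) -> bytes:
    """Invert b58encode_block; raises ValueError on malformed input."""
    if len(text) != WIDTH:
        raise ValueError("expected %d characters" % WIDTH)
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)  # ValueError if ch is not Base58
    if n >= 1 << 128:
        raise ValueError("value does not fit in 16 bytes")
    return n.to_bytes(16, "big")
```

With this, `b58encode_block(obj.encrypt(...))` would replace the `base64.b64encode` call, and `obj.decrypt(b58decode_block(token))` the decoding step.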

If you can live with weak crypto, you could also simply xor the padded ID with a key:

>>> key = [33, 53, 2, 42]
>>> id = "%04d" % 1
>>> x = ''.join([chr(a^ord(b)) for a, b in zip(key, id)])
>>> x
'\x11\x052\x1b'
>>> int(''.join([chr(a^ord(b)) for a, b in zip(key, x)]))
1

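But this is much less secure: XORing with a key is only safe as a one-time pad (OTP), and you should never use the same OTP for multiple messages. Also make sure the key is the same length as your padded ID, because zip() silently drops whatever does not pair up.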
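
To see why a reused XOR key is weak, here is a small sketch working on integers rather than padded strings. `KEY` and `xor_obfuscate` are hypothetical names chosen for this example. XORing two obfuscated values cancels the key, so anyone holding two tokens learns the XOR of the underlying IDs.

```python
KEY = 0x5A17C3E9  # hypothetical fixed 32-bit key, reused for every ID


def xor_obfuscate(user_id: int) -> int:
    """XOR with a fixed key; applying it twice returns the original ID."""
    return user_id ^ KEY


a = xor_obfuscate(1000)
b = xor_obfuscate(1001)

# The key cancels out: the relation between the two IDs leaks directly.
print(a ^ b == 1000 ^ 1001)  # True
```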

mensi
  • @StefanNch http://gitorious.org/bitcoin/python-base58/blobs/master/base58.py I think this is what I need to replace base64 with base58 – Korneel Jun 15 '12 at 10:20
  • Do you think the cost of this operation (base58 decode + AES decrypt) to convert the obfuscated ID back to the original ID is justified just for hiding the ID sequence? Could the decrypt operation be optimized at the cost of the encrypt operation? – Korneel Jun 15 '12 at 10:28
  • AES is designed for high throughput and we're talking 16 byte strings. This should hardly be a performance issue. – mensi Jun 15 '12 at 10:29
  • Is it possible to make the resulting obfuscated string shorter? Maybe with another encryption algorithm? YouTube's IDs are only 11 characters, while this is more than double that length. I would like to keep my URLs short. – Korneel Jun 15 '12 at 10:36
  • @Korneel: Most cryptographically secure ciphers have a minimum block size, such as 16 bytes. If you are willing to relax the security requirements, see the xor method I added to my answer. – mensi Jun 15 '12 at 11:09
  • Hasty Pudding cypher is designed with a variable block size, so that may suit your requirements better. – rossum Jun 15 '12 at 11:55
  • `Base58(AES(id))` is what I eventually used. It results in 22-character obfuscated IDs. I would prefer fewer characters, but that's rather complicated to achieve. – Korneel Jul 06 '12 at 10:43
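
For shorter IDs, the Feistel suggestion from the comments on the question can be sketched roughly as follows. This is an illustration under assumptions, not a vetted cipher: it uses a 64-bit block (so 11 Base58 characters suffice, since 58**11 > 2**64), four rounds, and HMAC-SHA256 as the round function. `SECRET`, `obfuscate` and `deobfuscate` are hypothetical names. As noted in the comments, this obfuscates but should not be treated as secure.

```python
import hashlib
import hmac

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
SECRET = b"replace-with-your-own-secret"  # hypothetical key
ROUNDS = 4
WIDTH = 11  # 58**11 > 2**64, so 11 digits cover any 64-bit value


def _round(half: int, i: int) -> int:
    """Keyed round function: first 32 bits of HMAC-SHA256(round, half)."""
    msg = bytes([i]) + half.to_bytes(4, "big")
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def _feistel(n: int, round_order) -> int:
    """Run a balanced Feistel network over a 64-bit integer."""
    left, right = n >> 32, n & 0xFFFFFFFF
    for i in round_order:
        left, right = right, left ^ _round(right, i)
    # Final swap, so the inverse is the same network with reversed rounds.
    return (right << 32) | left


def obfuscate(user_id: int) -> str:
    """Map an ID in [0, 2**64) to a fixed-length Base58 string."""
    if not 0 <= user_id < 1 << 64:
        raise ValueError("ID out of range")
    n = _feistel(user_id, range(ROUNDS))
    digits = []
    for _ in range(WIDTH):
        n, rem = divmod(n, 58)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def deobfuscate(token: str) -> int:
    """Invert obfuscate; raises ValueError on malformed tokens."""
    if len(token) != WIDTH:
        raise ValueError("expected %d characters" % WIDTH)
    n = 0
    for ch in token:
        n = n * 58 + ALPHABET.index(ch)
    if n >= 1 << 64:
        raise ValueError("token out of range")
    return _feistel(n, reversed(range(ROUNDS)))
```

Because a Feistel network is a permutation of the 64-bit space, distinct IDs always give distinct tokens, and `deobfuscate(obfuscate(i)) == i` for every valid `i`.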

This is an old question that I stumbled upon. I recently found the hashids library, which solves this problem and is available for a wide range of programming languages:

http://hashids.org

Tim
  • It does not solve the problem. It only lets you specify a minimum length. A fixed length is not possible. – T3rm1 Mar 03 '21 at 12:13
  • @T3rm1 you are obviously correct. If you want to allow for an arbitrarily large ID, you will eventually get longer strings. If you limit the string to a fixed length, you would have to accept hash collisions for very large numbers, which in my opinion would be the worse outcome, as it renders some IDs useless. I would advise setting the minimum length to a size that easily covers the expected ID range. – Tim Mar 08 '21 at 19:57