Basically I'm trying to make my Django + PostgreSQL database perform better.
Say I have a table with first_name, last_name, state, address. For whatever reason I want first_name and last_name together to make my primary key. Doesn't have to make sense, it's just an example.
At first I used zlib.adler32() to generate a hash from the two strings, put it into a BigIntegerField, and used that as my primary key, but I quickly discovered that this function is a 32-bit checksum meant for detecting data corruption, not for producing unique keys, and I was getting collisions pretty quickly.
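To put a rough number on "pretty quickly": the sketch below uses the standard birthday-bound approximation to estimate the chance of at least one collision among n rows. It assumes hash values are spread uniformly, which is optimistic for adler32 on short strings, so real results are likely worse. The function name collision_probability is made up for this illustration.

```python
import math

def collision_probability(n_rows: int, hash_bits: int) -> float:
    """Birthday-bound approximation: chance that at least two of n_rows
    uniformly random hash_bits-bit values are equal."""
    space = 2 ** hash_bits
    exponent = n_rows * (n_rows - 1) / (2 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)

for n in (10_000, 100_000, 1_000_000):
    p32 = collision_probability(n, 32)
    p64 = collision_probability(n, 64)
    print(f"{n:>9} rows: 32-bit {p32:.4f}, 64-bit {p64:.2e}")
```

Under that assumption, a 32-bit value is more likely than not to collide somewhere around a hundred thousand rows, while a 64-bit value stays far below that at the same sizes.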
Currently I'm using hashlib.md5() to generate a hash into a 32-character CharField and use it as my primary key:
hashlib.md5(bytes(f'{first_name}{last_name}', encoding='utf-8')).digest().hex()
However, things slowed down a lot. I don't know if it's because of the md5 algorithm or because the table has a string as its primary key instead of an integer.
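One way to separate the two suspects is to time the hashing by itself, outside Django and PostgreSQL. This is only a rough sketch: the sample key "JohnSmith" is a placeholder, and the numbers depend on the machine.

```python
import hashlib
import timeit
import zlib

# Placeholder input standing in for first_name + last_name.
data = "JohnSmith".encode("utf-8")

def md5_hex() -> str:
    # Same result as .digest().hex(), via the dedicated method.
    return hashlib.md5(data).hexdigest()

def adler() -> int:
    return zlib.adler32(data)

n = 100_000
md5_seconds = timeit.timeit(md5_hex, number=n)
adler_seconds = timeit.timeit(adler, number=n)

print(f"md5:     {md5_seconds / n * 1e6:.2f} us per call")
print(f"adler32: {adler_seconds / n * 1e6:.2f} us per call")
```

If both come out at a few microseconds or less per call, the hash function is probably not where the time goes, and the string primary key (a wider index, string comparisons) becomes the more likely cause.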
I know there is another option, the unique_together value of the Meta class, but I've been trying to avoid any Django conveniences for performance reasons.
How would you recommend making a primary key? I'm looking for performance; it doesn't need to be human readable. If BigInteger is much faster as a primary key, how do I generate a 20-digit int from a variable-length string?
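For illustration, here is a minimal sketch of the kind of function I have in mind. It makes several assumptions: a BigIntegerField stores a signed 64-bit value (so at most 19 digits, not 20), blake2b truncated to 8 bytes is an acceptable hash, and name_key is a made-up helper name. A 64-bit hash can still collide, so the column would still need a uniqueness check.

```python
import hashlib

def name_key(first_name: str, last_name: str) -> int:
    """Map two strings to a signed 64-bit integer (hypothetical helper)."""
    # A separator keeps ("ab", "c") and ("a", "bc") from hashing the same input.
    raw = f"{first_name}\x1f{last_name}".encode("utf-8")
    # digest_size=8 asks blake2b for exactly 8 bytes, i.e. 64 bits.
    digest = hashlib.blake2b(raw, digest_size=8).digest()
    # signed=True keeps the result in the signed 64-bit range.
    return int.from_bytes(digest, byteorder="big", signed=True)

key = name_key("John", "Smith")
print(key)
```

The same inputs always give the same integer, so the key can be recomputed at lookup time instead of being stored anywhere else.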