TL;DR version:
I have a long encoded JSON payload that I store in Redis as a key. I'd like to know whether hashing the key before storing it will improve lookup performance, and which hashing algorithm is recommended (I'm considering MD5/SHA-1).
P.S. I'm using Python.
Other notes:
- The TTL for each key is short (30 seconds), so I'm not worried about hash collisions.
- I only need to check whether the key already exists in Redis (rough sketch of what I mean below).
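To make the notes concrete, here is a rough sketch of what I mean by hashing the key first (the payload below is just a made-up placeholder; the real value is a much longer encoded string):

```python
import hashlib

# placeholder payload; the real thing is a long encoded transaction string
payload = b'{"txn_id": "abc123", "amount": 100}'

md5_key = hashlib.md5(payload).hexdigest()    # 32 hex chars
sha1_key = hashlib.sha1(payload).hexdigest()  # 40 hex chars

# either way the Redis key becomes a short, fixed-length string
print(len(payload), len(md5_key), len(sha1_key))
```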
Long story version:
I have a stream of JSON transactions, encoded in protobuf, flowing into my application via a message queue at a high rate. Worker nodes read the data from the queue and process it. However, I realized that duplicates were sometimes being sent.
My solution was to store the data in a "global" cache (Redis) that my workers check before attempting to process a transaction. Since the flow rate is high, decoding and inspecting the data is expensive, so I store the raw strings whole.
Transactions expire after 30 seconds, so I use a TTL of 30 seconds.
I'm therefore wondering whether hashing the strings before storing them would be a good idea, since I only need to check for existence.
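For reference, this is roughly the check each worker would run (a sketch assuming redis-py and a local Redis instance; `is_duplicate` is just a name I made up for this post):

```python
import hashlib
import redis

r = redis.Redis()  # assumes a local Redis instance

def is_duplicate(raw_payload: bytes) -> bool:
    """Return True if this payload was already seen in the last 30 seconds."""
    key = hashlib.sha1(raw_payload).hexdigest()
    # SET with nx=True and ex=30 marks the key and checks for a prior value
    # in one atomic round trip, so two workers can't both claim the same
    # transaction.
    return r.set(key, 1, nx=True, ex=30) is None
```

A worker would only process a transaction when `is_duplicate` returns False; otherwise it drops it as a repeat.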
Thanks for reading!