I have about 100 million simple key-value pairs (it's legacy data that never needs updating, and the keys are random strings), and I want to store them in Redis for querying.
My idea was to use the first four characters of each key as a Redis hash key and store the pairs inside hash types, so there would be about a million hash keys in Redis, each holding about 100 sub-keys.
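Concretely, the scheme looks like this (a minimal sketch with redis-py; the `store`/`lookup` function names are just for illustration):

```python
import redis

r = redis.Redis()

def store(key: str, value: str) -> None:
    # Current scheme: the first four characters of the key name the Redis hash,
    # so every key sharing that prefix becomes a field of the same hash.
    r.hset(key[:4], key, value)

def lookup(key: str) -> str | None:
    # Reads recompute the same prefix to find the right hash.
    raw = r.hget(key[:4], key)
    return raw.decode() if raw is not None else None
```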
But things didn't go as planned. For some reason, some hash keys ended up with only one sub-key while others have more than 500,000, and those huge hashes can't use Redis's compact ziplist/listpack encoding (which only applies below hash-max-ziplist-entries), so they aren't memory-efficient.
So I'd like to know: is there a simple, understandable algorithm that can divide my 100 million strings evenly into 100 thousand integer buckets, so that when I look up a string, I can find its bucket by running the same algorithm?
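For example, something along these lines would work for me (a Python sketch, assuming a general-purpose hash like MD5 spreads arbitrary strings uniformly; `NUM_BUCKETS` and `bucket_for` are my own names):

```python
import hashlib

NUM_BUCKETS = 100_000  # target bucket count from the question

def bucket_for(key: str) -> int:
    """Map a key to a stable bucket number in [0, NUM_BUCKETS).

    MD5 is cryptographic overkill here, but it distributes arbitrary
    strings uniformly, so with 100 million keys each bucket should
    hold roughly 1000 entries on average.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then
    # reduce modulo the bucket count.
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS

# Usage: the same deterministic function at write time and read time,
# e.g. to build the Redis hash key for a given string.
redis_key = f"bucket:{bucket_for('some-random-key')}"
```

The key property is determinism: the same string always maps to the same bucket, so no lookup table is needed.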
Thanks!!