I have about 20 million key-value pairs, and I need to create two dictionaries from them.
First dictionary: The values are ints, from 0 to 20 million. The keys are strings of length 40 characters, for example '36ae99662ec931a3c20cffdecb39b69a8f7f23fd'.
Second dictionary: Reverse of the first dictionary. The keys are ints, from 0 to 20 million. The values are strings of length 40 characters, for example '36ae99662ec931a3c20cffdecb39b69a8f7f23fd'.
I think there are more options for the second dictionary, since the index can just be used as the key. For this second dictionary, sqlite3 looks promising.
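For concreteness, here is a minimal sketch of what I have in mind for the second dictionary, using the standard-library sqlite3 module. The table name `id_to_key` and the helper names `build_index_to_key` and `lookup_key` are made up for illustration, and I have not measured how much disk space this takes for 20 million rows.

```python
import sqlite3


def build_index_to_key(db_path, keys):
    """Store each string under its position (0, 1, 2, ...) in `keys`."""
    conn = sqlite3.connect(db_path)
    # INTEGER PRIMARY KEY makes `id` an alias for SQLite's rowid,
    # so no separate index is needed for lookups by id.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS id_to_key "
        "(id INTEGER PRIMARY KEY, key TEXT NOT NULL)"
    )
    with conn:  # commit all inserts as a single transaction
        conn.executemany(
            "INSERT INTO id_to_key (id, key) VALUES (?, ?)",
            enumerate(keys),
        )
    return conn


def lookup_key(conn, idx):
    """Return the string stored for `idx`, or None if it is missing."""
    row = conn.execute(
        "SELECT key FROM id_to_key WHERE id = ?", (idx,)
    ).fetchone()
    return row[0] if row else None
```

Passing a generator as `keys` should avoid holding all 20 million strings in memory at once while the table is built.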
Lookup speed is not too important; a 1-second lookup should be okay. My main concern is that I don't have much space to store the dictionaries.
My best guess for the first dictionary comes from this SO post:
*large* python dictionary with persistence storage for quick look-ups
It looks like dbm would be a decent solution for the first dictionary, since all the keys and values are stored as bytes. However, that answer was given 7 years ago, in 2012, and I am not sure whether it is still a decent solution today.
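This is roughly how I imagine the dbm approach for the first dictionary. It is only a sketch: the helper names `build_key_to_index` and `lookup_index` are made up, and I am assuming the keys are plain ASCII strings and that storing each int as its decimal text is acceptable. Which backend `dbm.open` picks depends on the platform, so the on-disk size will vary.

```python
import dbm


def build_key_to_index(db_path, keys):
    """Map each string in `keys` to its position (0, 1, 2, ...)."""
    # "c" opens the database for reading and writing,
    # creating it if it does not exist yet.
    with dbm.open(db_path, "c") as db:
        for idx, key in enumerate(keys):
            # dbm stores keys and values as bytes.
            db[key.encode("ascii")] = str(idx).encode("ascii")


def lookup_index(db_path, key):
    """Return the int stored for `key`, or None if it is missing."""
    with dbm.open(db_path, "r") as db:
        raw_key = key.encode("ascii")
        if raw_key in db:
            return int(db[raw_key])
        return None
```

Reopening the file on every lookup keeps the sketch short; for many lookups it would make more sense to keep one handle open.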