I am using a dictionary in Python to store (row, col) coordinates taken from a large NumPy array. The dictionary has almost 100,000 entries.
Each key is a (row, col) tuple. Some sample values in this structure are:
OrderedDict([((1783, 586), 0), ((1783, 587), 1), ((1783, 588), 2), ((1783, 589), 3), ((1783, 590), 4), ((1784, 584), 5), ((1784, 585), 6), ((1784, 586), 7), ((1784, 587), 8), ((1784, 588), 9), ((1784, 589), 10), ((1784, 590), 11), ((1784, 591), 12), ((1784, 592), 13), ((1784, 593), 14), ((1784, 594), 15), ((1784, 595), 16), ((1785, 583), 17), ((1785, 584), 18), ((1785, 585), 19), ((1785, 586), 20), ((1785, 587), 21), ((1785, 588), 22), ((1785, 589), 23), ((1785, 590), 24), ((1785, 591), 25), ((1785, 592), 26), ((1785, 593), 27), ((1785, 594), 28), ((1785, 595), 29), ((1785, 596), 30), ((1785, 597), 31),...
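For context, the dictionary is built along these lines (a simplified sketch only; the array shape and the mask below are placeholders, not my real data):

    import numpy as np
    from collections import OrderedDict

    # Sketch: map each (row, col) of the True pixels in a 2-D mask to a
    # running index. The shape and mask here are illustrative placeholders.
    mask = np.zeros((2000, 1000), dtype=bool)
    mask[1783, 586:591] = True
    mask[1784, 584:596] = True

    keyed_var_pixels = OrderedDict(
        (tuple(rc), i) for i, rc in enumerate(np.argwhere(mask))
    )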
The processing is taking forever, and the key lookups appear to be the bottleneck.
I perform each lookup with a (row, col) tuple:
if (1783,586) in keyed_var_pixels:
Based on this post, the in keyword on a dict object should use hashing. Each lookup seems to take around 0.02 seconds, which works out to roughly 30 minutes over the entire dataset. That seems far too long for a hashed retrieval. How can I improve this runtime, or is there an alternative data structure that would give fast retrieval and existence checks for these values?
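To confirm whether the dict lookup itself is the slow part (rather than the surrounding loop), the membership test could be timed in isolation along these lines (just a sketch; keyed_var_pixels is the dictionary above and the repetition count is arbitrary):

    import timeit

    # Time only the membership test, outside the rest of the processing loop.
    n = 100_000
    t = timeit.timeit(
        "(1783, 586) in keyed_var_pixels",
        globals={"keyed_var_pixels": keyed_var_pixels},
        number=n,
    )
    print(t / n)  # for a plain dict/OrderedDict this is normally well under a microsecond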
Thanks in advance!