I'm running code that uses 16 processes to build up 16 dictionaries of roughly 62,500 entries each (about 1,000,000 entries in total). After each process finishes, I update a single dictionary like this:
    main_dict.update(sub_dict)
I'm finding that my code seems to hang a lot near the end of the whole script (around when I'd expect some of the processes to start returning their sub_dicts), so I suspect the dictionary update is the culprit.
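For reference, here's a minimal sketch of the pattern I'm using (`build_sub_dict` and the integer ranges are placeholders standing in for my real worker function and data):

```python
# Minimal sketch of the pattern: 16 workers each build a disjoint
# sub-dict, and the parent merges them one at a time with update().
from multiprocessing import Pool

def build_sub_dict(start):
    # Placeholder worker: each builds ~62,500 unique keys.
    return {k: k * 2 for k in range(start, start + 62_500)}

if __name__ == "__main__":
    starts = [i * 62_500 for i in range(16)]
    main_dict = {}
    with Pool(processes=16) as pool:
        # Merge each sub-dict into main_dict as it comes back.
        for sub_dict in pool.imap_unordered(build_sub_dict, starts):
            main_dict.update(sub_dict)
    print(len(main_dict))  # 1,000,000 keys in total
```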
Supposedly update needs to check every key of the sub_dict against those of the main_dict, so in my example that could mean up to 62500 * 937500 checks for the last update, right?
Am I on the right track here? And if so, is there a way to speed things up? I know the keys are going to be unique and there will never be any overlap between the sub_dicts, so maybe that helps.
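To sanity-check my suspicion, this is the kind of timing experiment I've been considering, with synthetic integer keys standing in for my real data (sizes chosen to match the final, worst-case merge):

```python
# Time a single update() of a 62,500-key dict into a 937,500-key dict,
# mimicking the last merge in my script (synthetic keys, disjoint ranges).
import time

main_dict = {k: k for k in range(937_500)}             # 15 sub-dicts already merged
sub_dict = {k: k for k in range(937_500, 1_000_000)}   # the 16th, 62,500 disjoint keys

start = time.perf_counter()
main_dict.update(sub_dict)
elapsed = time.perf_counter() - start
print(f"Final update took {elapsed:.4f}s")
```

If that final update is fast on synthetic data, the hang is presumably happening somewhere else (e.g. in getting the sub_dicts back from the worker processes) rather than in update itself.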