Sorry if this question seems overly noob to you. I've taken programming courses but never one on computer architecture. I had to pretty much learn from Wiki/SO/Google.
I have a `dict` called `LUT`, and I need to parallelize its lookups (read-only). I have a `list` of `item`s that I am scattering to multiple threads/processes, and each thread/process then looks up `LUT[item]` for each `item` in its chopped-up sublist.
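For concreteness, here is roughly what the serial version looks like; all names here are illustrative stand-ins, not my actual data:

```python
# Illustrative stand-ins for my actual data.
LUT = {i: i * i for i in range(1_000_000)}   # the read-only lookup table
items = list(range(0, 1_000_000, 3))         # keys to resolve against LUT

def lookup_chunk(chunk):
    """Resolve one chopped-up sublist of items against the shared LUT."""
    return [LUT[item] for item in chunk]

serial_result = lookup_chunk(items)          # the baseline I want to beat
```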
I can only think of 7 options to achieve this:
1. `threading` module, all threads look up the same `dict` (see the sketch after this list)
2. `multiprocessing` module, all processes look up the same `dict`
3. `multiprocessing` module, each process looks up its own copy of the `dict`, e.g. if there are 2 processes, there are 2 copies of the `dict`
4. `multiprocessing` module, all processes look up a "shared proxy dict": `Manager().dict()`

The following 3 options use Cython, since I've heard it can be used to overcome Python's GIL:

5. Cython with C++'s STL `unordered_map` and multithreading, all threads look up the same `unordered_map`
6. Cython with C++'s STL `unordered_map` and `multiprocessing`, all processes look up the same `unordered_map`
7. Cython with C++'s STL `unordered_map` and `multiprocessing`, each process looks up its own copy of the `unordered_map`
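For reference, option 1 would look something like this, reusing the illustrative `items` and `lookup_chunk` from the first sketch (and `concurrent.futures` rather than bare threads):

```python
# Option 1: all threads read the same dict; nothing is copied, but the
# GIL serializes the bytecode that performs the lookups.
from concurrent.futures import ThreadPoolExecutor

def threaded_lookup(items, n_workers=4):
    # Chop the item list into n_workers interleaved sublists.
    chunks = [items[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(lookup_chunk, chunks))
```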
I have already tried options 2, 3, and 4. Options 2 and 4 are around 100-1000x slower than serial lookup. Option 3 works well, but its memory usage is too high, since it keeps one full copy of the dictionary per process.
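For example, my option 4 attempt looked roughly like this; my understanding is that every `proxy_lut[item]` is an IPC round trip to the manager process, which would explain the slowdown (again reusing the illustrative `LUT` and `items` from the first sketch):

```python
# Option 4: a Manager proxy dict shared across processes. Each lookup
# is marshalled to the manager process and back.
from multiprocessing import Manager, Pool

def lookup_chunk_via_proxy(args):
    proxy_lut, chunk = args
    return [proxy_lut[item] for item in chunk]

if __name__ == "__main__":
    n_workers = 4
    with Manager() as manager:
        proxy_lut = manager.dict(LUT)   # proxy wrapping the original LUT
        chunks = [items[i::n_workers] for i in range(n_workers)]
        with Pool(n_workers) as pool:
            results = pool.map(lookup_chunk_via_proxy,
                               [(proxy_lut, c) for c in chunks])
```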
Options 5, 6, and 7 use Cython and its ability to wrap C++'s STL `unordered_map`, which is the C++ equivalent of Python's `dict`. Option 5 should technically overcome Python's GIL, but I am wondering whether multithreading can really speed up something that is CPU-bound. What is my best bet here?
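For completeness, here is the rough, untested sketch I have in mind for option 5. It assumes the keys and values fit in C `long`s, so the lookup loop touches no Python objects and can release the GIL; the file would need to be compiled as C++ with OpenMP enabled for `prange` to actually use multiple threads:

```cython
# cython: boundscheck=False, wraparound=False
# distutils: language = c++
# Untested sketch of option 5: copy the dict into a C++ unordered_map
# once, then do all lookups in a GIL-free OpenMP loop.
from cython.parallel import prange
from libcpp.unordered_map cimport unordered_map

def lookup_all(dict py_lut, long[::1] items, long[::1] out, int num_threads):
    cdef unordered_map[long, long] lut
    cdef long key, val
    # Build the C++ map while still holding the GIL.
    for key, val in py_lut.items():
        lut[key] = val
    cdef Py_ssize_t i, n = items.shape[0]
    # Pure C reads: no Python objects, so threads can truly run in parallel.
    for i in prange(n, nogil=True, num_threads=num_threads):
        out[i] = lut.at(items[i])   # .at() is const, safe for concurrent reads
```

The `long[::1]` memoryviews would come from e.g. NumPy arrays of a dtype matching C `long`, which is itself an assumption about my data that may not hold for arbitrary keys.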