So, I'm working on an application that has to check ~50 GB of data against a list of hashes every time it starts up. Obviously this needs to be parallelized, and I don't want the application hanging on a "LOADING..." screen for a minute and a half.
I'm using multiprocessing.Pool's map_async to handle this; the main thread calls map_async(checkfiles, path_hash_pairs, callback) and provides a callback that tells it to throw up a warning if mismatches are found.
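For context, a worker of this kind could look roughly like the sketch below. It is not the actual application code: SHA-256, the 1 MiB chunk size, and the (path, ok) return shape are all assumptions for illustration. The only thing taken from the description above is that checkfiles receives one (path, hash) pair per call.

```python
import hashlib

CHUNK_SIZE = 1 << 20  # assumed: read 1 MiB at a time to keep memory use flat


def checkfiles(path_hash_pair):
    """Hash one file and compare it with the expected hex digest.

    Returns (path, True) on a match and (path, False) on a mismatch.
    SHA-256 is an assumption; the real application may use another hash.
    """
    path, expected_hex = path_hash_pair
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return path, digest.hexdigest() == expected_hex.lower()
```

Because each call takes a single tuple, a function shaped like this can be passed to map or map_async together with a list of (path, hash) pairs.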
Trouble is... nothing happens. Looking at the Python processes with my task manager shows they spawn and then immediately terminate without doing any work. They never print anything and certainly never finish and call the callback.
This minified example also exhibits the same problem:
import multiprocessing
import time

def printme(x):
    time.sleep(1)
    print(x)
    return x**2

if __name__ == "__main__":
    l = list(range(0,512))

    def print_result(res):
        print(res)

    with multiprocessing.Pool() as p:
        p.map_async(printme, l, callback=print_result)
    p.join()
    time.sleep(10)
Run it, and... nothing happens. Swapping map_async for map works exactly as expected.
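For comparison, here is a sketch of the map variant that behaves as expected. The list and the sleep are shortened so it finishes quickly; those numbers are changes made for this illustration, not part of the example above.

```python
import multiprocessing
import time


def printme(x):
    time.sleep(0.1)  # shortened from 1 s for a quick comparison run
    print(x)
    return x**2


if __name__ == "__main__":
    l = list(range(0, 16))  # shortened from 512 items

    with multiprocessing.Pool() as p:
        # map blocks until every task has finished, so all results
        # exist before the with-block is left
        res = p.map(printme, l)
    print(res)
```

With map, each worker prints its input and the full list of squares is printed once at the end.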
Am I just making a stupid mistake or what?