I would like to do the following:
- read data from a csv file
- process each line of that csv (assume processing each line involves a long network operation)
- write the result to another file
I have tried gluing together this and this answer, but with little success. The code for the second queue never gets called, so nothing is ever written to disk. How do I let the process know there is a second queue?
Note that I am not necessarily a fan of multiprocessing. If async/await works better, I am all for it.
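For context, this is roughly the async/await shape I have in mind (a minimal sketch; process_line is a stand-in for the real network call, and in.csv / out.txt are placeholder filenames):

import asyncio

async def process_line(line):
    # placeholder for the long network operation
    await asyncio.sleep(1)
    return "processed:" + line.strip()

async def main():
    with open('in.csv') as infile, open('out.txt', 'w') as outfile:
        # schedule all lines concurrently, then write results in order
        tasks = [asyncio.create_task(process_line(line)) for line in infile]
        for result in await asyncio.gather(*tasks):
            outfile.write(result + '\n')

asyncio.run(main())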
My code so far:
import multiprocessing
import os
import time

in_queue = multiprocessing.Queue()
out_queue = multiprocessing.Queue()

def worker_main(in_queue, out_queue):
    print(os.getpid(), "working")
    while True:
        item = in_queue.get(True)
        print(os.getpid(), "got", item)
        time.sleep(1)  # long network processing
        print(os.getpid(), "done", item)
        # put the processed item on the queue to be written to disk
        out_queue.put("processed:" + str(item))

pool = multiprocessing.Pool(3, worker_main, (in_queue, out_queue))

for i in range(5):  # let's assume this is the file reading part
    in_queue.put(i)

with open('out.txt', 'w') as file:
    while not out_queue.empty():
        try:
            value = out_queue.get(timeout=1)
            file.write(value + '\n')
        except Exception as qe:
            print("Empty queue or dead process")