Store the results of a multiprocessing queue in python

Question

I'm trying to store the results of multiple API requests using multiprocessing queue as the API can't handle more than 5 connections at once.

I found part of a solution of How to use multiprocessing with requests module?

def worker(input_queue, stop_event):
    while not stop_event.is_set():
        try:
            # Check if any request has arrived in the input queue. If not,
            # loop back and try again.
            request = input_queue.get(True, 1)
            input_queue.task_done()
        except queue.Empty:
            continue
        print('Started working on:', request)
        api_request_function(request) #make request using a function I wrote

        print('Stopped working on:', request)


def master(api_requests):
    input_queue = multiprocessing.JoinableQueue()
    stop_event = multiprocessing.Event()
    workers = []
    # Create workers.
    for i in range(3):
        p = multiprocessing.Process(target=worker,
                                    args=(input_queue, stop_event))
        workers.append(p)
        p.start()

    # Distribute work.
    for requests in api_requests:
        input_queue.put(requests)

    # Wait for the queue to be consumed.
    input_queue.join()
    # Ask the workers to quit.
    stop_event.set()

    # Wait for workers to quit.
    for w in workers:
        w.join()

    print('Done')

I've looked at the documentation of threading and pooling but missing a step. So the above runs and all requests get a 200 status code which is great. But I do I store the results of the requests to use?

Thanks for your help Shan

score 0 · Answer 1 · answered Apr 26 '20 at 23:01

I believe you have to make a Queue. The code can be a little tricky, you need to read up on the multiprocessing module. In general, with multiprocessing, all the variables are copied for each worker, hence you can't do something like appending to a global variable. Since that will literally be copied and the original will be untouched. There are a few functions that already automatically incorporate workers, queues, and return values. Personally, I try to write my functions to work with mp.map, like below:

def worker(*args,**kargs):
    #do stuff
    return 'thing'
output = multiprocessing.Pool().map(worker,[1,2,3,4,5])

Thanks Bobby, i tried using a pool but I had an issue of the pool not waiting for the request to finish and making another request overloading the API. But I'll have another play — Shahnur Islam, Apr 26 '20 at 23:21

Store the results of a multiprocessing queue in python

1 Answers1