When you `list`-ify the `map`, every single request is dispatched serially: each waits for completion, then stores its result in the resulting `list`. If you're dispatching 1000 requests, all 1000 must complete, one by one, before the `list` is constructed and you see the first result; it's entirely synchronous.
You get results (almost) immediately when iterating the `map` directly because it only makes one request at a time: instead of waiting for 1000 requests, it waits for one, you process that result, then it waits for the next, and so on.
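To see the difference concretely, here's a self-contained sketch using a hypothetical `fake_request` function (a stand-in simulating a slow network call, not the real client):

```python
import time

def fake_request(n):
    """Hypothetical stand-in for a slow network request."""
    time.sleep(0.1)
    return n * 2

# Iterating the map directly: one request runs, then you get its result.
start = time.time()
lazy = map(fake_request, range(10))   # nothing has run yet; map is lazy
first = next(lazy)                    # exactly one request completes here
time_to_first = time.time() - start   # ~0.1s

# list-ifying the map: all ten requests run serially before you see anything.
start = time.time()
results = list(map(fake_request, range(10)))
time_to_list = time.time() - start    # ~1.0s

print(time_to_first, time_to_list)
```

The lazy iterator hands you the first result after one request's worth of latency; the `list` version makes you pay for all ten up front.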
If the goal is to minimize latency, take a look at `multiprocessing.Pool.imap` (or the thread-based version of the pool implemented in `multiprocessing.dummy`; threads can be ideal for parallel network I/O and don't require pickling data for IPC). With the `Pool`'s `map`, `imap`, or `imap_unordered` methods (choose one based on your needs), the requests will be dispatched asynchronously, several at a time (depending on the number of workers you select). If you absolutely must have a `list`, `Pool.map` will usually construct it faster; if you can iterate directly and don't care about the ordering of results, `Pool.imap_unordered` will hand you results as fast as the workers can get them, in whatever order they're satisfied. Plain `map` without a `Pool` isn't getting you any magical performance benefit (a list comprehension would usually run faster, actually), so use a `Pool`.
Simple example code for fastest results:
    import multiprocessing.dummy as multiprocessing  # Thread-based version of the library; fine for network I/O

    with multiprocessing.Pool(8) as pool:  # Pool of eight worker threads
        for item in pool.imap_unordered(lambda product: client.request(units_url + "/" + product), units):
            print(item)
If you really need to, you can use `Pool.map` and store to a real `list`, and assuming you have the bandwidth to run eight parallel requests (or however many workers you configure the pool for), that should (roughly) divide the time to complete the `map` by eight.
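A self-contained sketch of that `Pool.map` variant, with a hypothetical `fetch` function standing in for the question's `client.request` call (the real client and `units_url` aren't available here):

```python
import time
import multiprocessing.dummy as multiprocessing  # thread-based Pool

def fetch(product):
    """Hypothetical stand-in for client.request(units_url + "/" + product)."""
    time.sleep(0.1)  # simulate network latency
    return f"result for {product}"

units = [f"product{i}" for i in range(16)]

start = time.time()
with multiprocessing.Pool(8) as pool:
    results = pool.map(fetch, units)  # real list, ordered to match units, built in parallel
elapsed = time.time() - start

# 16 requests across 8 workers -> roughly two "rounds" of ~0.1s,
# versus ~1.6s if dispatched serially.
print(f"{len(results)} results in {elapsed:.2f}s")
```

Unlike `imap_unordered`, `Pool.map` blocks until every request finishes and preserves input order, so you trade time-to-first-result for a ready-made `list`.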