80

What's the difference between using map and map_async? Are they not running the same function after distributing the items from the list to 4 processes?

So is it wrong to presume both are running asynchronous and parallel?

from multiprocessing import Pool

def f(x):
    return 2*x

p = Pool(4)
l = [1, 2, 3, 4]
out1 = p.map(f, l)
# vs
out2 = p.map_async(f, l)
Right leg
aman
  • 5
    Doesn't `map` return only once the map is done (ie synchronously but in parallel), while `map_async` returns right away and allows the mapping to be done in the background (ie asynchronously and in parallel)? – Joachim Isaksson Mar 10 '16 at 06:37

1 Answer

123

There are four choices for mapping jobs to processes. You have to consider multi-args, concurrency, blocking, and ordering. map and map_async differ only with respect to blocking: map_async is non-blocking, whereas map is blocking.
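To make the blocking distinction concrete, here is a minimal sketch. The names `double` and `compare_return_types` are made up for illustration and are not from the question. It assumes Python 3 and shows that `map` hands back the finished list, while `map_async` hands back a result handle whose `get()` yields the same list once the work is done.

```python
from multiprocessing import Pool
from multiprocessing.pool import AsyncResult


def double(x):
    # Hypothetical worker function, same idea as f in the question.
    return 2 * x


def compare_return_types(items):
    with Pool(processes=4) as pool:
        # Blocking: returns only when every call has finished.
        blocking = pool.map(double, items)
        # Non-blocking: returns a handle right away.
        pending = pool.map_async(double, items)
        is_handle = isinstance(pending, AsyncResult)
        # get() blocks until the results are available.
        non_blocking = pending.get(timeout=30)
    return blocking, is_handle, non_blocking


if __name__ == "__main__":
    print(compare_return_types([1, 2, 3, 4]))
```

If nothing else needs to happen in the main process, calling `get()` straight away makes the two approaches behave the same from the caller's point of view.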

So let's say you had a function

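from multiprocessing import Pool
import time

def f(x):
    print(x*x)

if __name__ == '__main__':
    pool = Pool(processes=4)
    # Blocks until all 10 calls have finished.
    pool.map(f, range(10))
    # Returns immediately; the calls run in the background.
    r = pool.map_async(f, range(10))
    # DO STUFF
    print('HERE')
    print('MORE')
    # Block until the background calls are done.
    r.wait()
    print('DONE')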

Example output:

0
1
9
4
16
25
36
49
64
81
0
HERE
1
4
MORE
16
25
36
9
49
64
81
DONE

pool.map(f, range(10)) waits for all 10 of those function calls to finish, so we see all of its prints together before anything else. r = pool.map_async(f, range(10)) executes them asynchronously and only blocks when r.wait() is called, so HERE and MORE appear in between, but DONE is always at the end.
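The same pattern can be sketched with return values instead of prints. This is only an illustration under assumptions: `slow_square` and `overlap_work` are hypothetical names, and the `time.sleep` call stands in for real work. The main process records what it does between submitting the jobs and waiting for them.

```python
import time
from multiprocessing import Pool


def slow_square(x):
    # Hypothetical stand-in for real work; the sleep makes the delay visible.
    time.sleep(0.2)
    return x * x


def overlap_work(items):
    events = []
    with Pool(processes=4) as pool:
        r = pool.map_async(slow_square, items)  # returns immediately
        events.append("submitted")
        # The main process is free to do other things here.
        events.append("main work")
        r.wait()  # block until all background calls have finished
        events.append("waited")
        finished = r.ready()  # True once the results are available
        results = r.get()
    return events, finished, results


if __name__ == "__main__":
    print(overlap_work(range(4)))
```

Calling `r.ready()` before `r.wait()` is a way to poll without blocking; whether it reports `True` at that point depends on timing.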

Carcigenicate
quikst3r
  • 3
    OK, so if I don't have other tasks to do besides executing the function f over the list, then map and map_async are the same – aman Mar 10 '16 at 06:59
  • 16
    Not quite. You'll notice map will execute in order, but map_async doesn't – quikst3r Mar 10 '16 at 23:35
  • 2
    Should there be a `print 'DONE'` after `r.wait()`? – HBeel Nov 01 '16 at 14:32
  • Yes there should be! – quikst3r Nov 02 '16 at 19:32
  • 1
    If above example doesn't return different results for `map` and `map_async` on first run, try setting `range(500)` or something large. – webelo Oct 28 '17 at 15:22
  • https://stackoverflow.com/questions/35708371/purpose-of-pool-join-pool-close-in-python-multiprocessing – Jirka Oct 29 '18 at 03:32
  • Hello, will this work with this solution: https://stackoverflow.com/questions/5442910/python-multiprocessing-pool-map-for-multiple-arguments ? I am attempting to pass multiple args to the function. – ScipioAfricanus Feb 26 '19 at 12:30
  • 1
    @quikst3r: For me they both execute in random order. I changed func f() to print(x), and the parameters: "pool.map(f, range(11,20)); r = pool.map_async(f, range(10))"; After running the code, it returns: "11 12 13 15 16 17 18 14 19 HERE 0 MORE 2 1 3 4 5 6 7 8 9 DONE" – Chau Pham May 08 '19 at 16:59
  • If you don't see HERE and MORE at random positions, you don't need to set range to a very high number and then keep scrolling for the prints. For me it was enough to add a `time.sleep(0.0002)` right before the print and increase range a little, to 100 – v.tralala Nov 24 '19 at 20:00
  • 4
    @Catbuilts Yeah, there is no difference in ordering between `map_async` and `map`, @quikst3r was mistaken when he said that. `map` is actually implemented internally as `map_async(...).get()`. – dano Feb 17 '21 at 18:07
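A small sketch of the ordering point from the last comment, with hypothetical names (`delayed_identity`, `ordered_results`) and an artificial delay chosen so that calls tend to finish out of order. Execution order inside the pool is not guaranteed for either method, but the returned list follows the input order in both cases.

```python
import time
from multiprocessing import Pool


def delayed_identity(x):
    # Smaller inputs sleep longer, so calls tend to finish out of order.
    time.sleep(max(0.0, 0.05 * (4 - x)))
    return x


def ordered_results(items):
    with Pool(processes=4) as pool:
        # chunksize=1 sends each item to the workers as its own task.
        via_map = pool.map(delayed_identity, items, chunksize=1)
        via_async = pool.map_async(delayed_identity, items, chunksize=1).get()
    return via_map, via_async


if __name__ == "__main__":
    print(ordered_results([0, 1, 2, 3, 4]))
```

The prints in the answer's example can interleave differently from run to run because they happen inside the worker processes; the collected results do not.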