
I am having a bit of trouble with parallel processing in Python. I am completely new to the concept of parallel computing. I use the multiprocessing module that comes with standard Python.

My computer has 12 hardware threads. I ask for 12 workers, but I don't always get all the workers I ask for. The problem arises when I don't get access to as many workers as I need to process the nTasks tasks in the code below (currently set to four). What happens then is that the code simply gets stuck and never reaches the line under the comment "# Get results". The number of workers I actually get seems random (I always ask for 12), but the code hangs whenever I get three workers or fewer:

import multiprocessing as mp
import scipy as sp
import scipy.stats as spstat
import pylab

def testfunc(x0, N):
    print 'working with x0 = %s' % x0
    x = [x0]
    for i in xrange(1, N):
        x.append(spstat.norm.rvs(size=1)) # stupid appending to make it slower
        if i % 10000 == 0:
            print 'x0 = %s, i = %s' % (x0, i)
    return sp.array(x)

def testfuncParallel(fargs):
    return testfunc(*fargs)

pool = mp.Pool(12) # I have 12 threads

nTasks = 4
N = 100000

tasks = [(x, n) for x, n in enumerate(nTasks*[N])] # nTasks different tasks

result = pool.map(testfuncParallel, tasks)
pool.close()
pool.join()

# Get results:
sim = sp.zeros((N, nTasks)) 

for nn, res in enumerate(result):    
    sim[:, nn] = res

pylab.figure()
for i in xrange(nTasks):
    pylab.subplot(nTasks,1, i + 1)
    pylab.plot(sim[:, i])

pylab.show()

I have tried using pool.map_async instead of pool.map, but I cannot get around the problem that way either.

Thanks in advance,

Sincerely,

Matias

  • this seems to work fine for me -- what versions of multiprocessing, scipy and pylab are you using? – Noah Aug 01 '11 at 15:18
  • why are you asking for 12 processes but only submitting 4 tasks? – Noah Aug 01 '11 at 15:21
    [also, pool.close() and pool.join() are only necessary with the map_async() version] – Noah Aug 01 '11 at 15:23
  • Are you using Windows? If so, you'll need an `if __name__=="__main__"` check before you create the pool. Otherwise each process will try to create a new pool, and it all blows up. – Thomas K Aug 01 '11 at 16:38
  • @Noah:I have Enthoughts 64 bit distribution of Python 2.7. The code sometimes works for me too, it seems randomly. It is only when I can't get access to four workers that it does not work. I just picked the number four randomly, I have tried 12 as well as 6 and 8. Good to know that pool.close() and pool.join() only are necessary with the map_async() version. Thank you so much for your time. – matiasq Aug 01 '11 at 20:05
  • @Thomas K: I have ubuntu so I don't run into problems with that. Thank you for your answer. – matiasq Aug 01 '11 at 20:06
