Parallelizing a loop in Python

Question

I have a population of 10,000 agents, and each agent has a location [x,y], where x and y are random numbers between 0 and 100. I want to calculate the distance between every pair of agents, decide whether they are neighbours, and store this information in a 10000x10000 array. So far I've come up with the following:

for j in range(len(agents_population[1])):
    B[j,j:10000] = [
        True if distance.euclidean(agents_population[1][j],
            agents_population[1][i]) < r 
        else False for i in range(j,len(agents_population[1]))
      ]

Here agents_population[1] is a list of the agents' coordinates (in this case a 10,000-item list of 2-item lists: [[x1,y1], [x2,y2], ...]), r is the radius, and B is that array. In each iteration I calculate the distance between agent j and agents j through 9999. The iterations of the loop are independent of one another, so I was thinking I could use all four cores of my processor instead of just one. I tried using the multiprocessing module and its Pool().map() function, but I wasn't able to make it work. I would be really grateful for any advice.

EDIT:

I tried using multiprocessing as follows:

def worker(j):
    B[j,j:10000] = [
        True if distance.euclidean(agents_population[1][j],
            agents_population[1][i]) < r 
        else False for i in range(j,len(agents_population[1]))
      ]

def mp_handler():
    p = Pool(4)
    p.map(worker, range(0))

if __name__ == '__main__':
    mp_handler() 

B[0,].tofile('foo.csv',sep=',')

I run it from cmd.exe, but when I open the file foo.csv, the results differ from what I get when I call the function worker directly with j=0. Am I doing something wrong?
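
A note on the snippet above, offered as a reading of the symptoms rather than a confirmed diagnosis: `range(0)` is empty, so `p.map` is handed no work at all, and with `multiprocessing` each worker process has its own copy of `B`, so assignments made inside `worker` never reach the parent's array. A minimal sketch of one way around this, assuming the rows are returned from the workers and assembled in the parent. The names `neighbour_row` and `neighbour_rows` are hypothetical, and `math.hypot` stands in for `distance.euclidean` only to keep the example standard-library only:

```python
import math
import random
from functools import partial
from multiprocessing import Pool


def neighbour_row(j, points, r):
    """Return the neighbour flags of agent j against agents j..n-1."""
    xj, yj = points[j]
    return [math.hypot(xj - x, yj - y) < r for x, y in points[j:]]


def neighbour_rows(points, r, processes=4):
    """Compute every row in worker processes and collect them in the parent."""
    row_for = partial(neighbour_row, points=points, r=r)
    with Pool(processes) as pool:
        # Pool.map returns the results in the same order as the inputs.
        return pool.map(row_for, range(len(points)), chunksize=64)


if __name__ == "__main__":
    # Hypothetical small population standing in for agents_population[1].
    random.seed(0)
    points = [[random.uniform(0, 100), random.uniform(0, 100)]
              for _ in range(200)]
    rows = neighbour_rows(points, r=10.0)
    # The parent owns the results, so this is where B would be filled,
    # e.g. B[j, j:] = rows[j] for each j.
    print(sum(map(sum, rows)), "neighbour flags set")
```

Everything that starts the pool or writes output sits under the `if __name__ == '__main__':` guard, which matters on Windows because the worker processes import the main module again. Binding `points` with `partial` means the list is sent to the workers along with the tasks, which is simple but not the cheapest option for 10,000 agents.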

Not really an answer but if you're using python2, using xrange instead of range would save you creating 10,000 unused lists. — Holloway, Nov 23 '16 at 14:04
http://stackoverflow.com/questions/15143837/how-to-multi-thread-an-operation-within-a-loop-in-python — Jaroslaw Matlak, Nov 23 '16 at 14:04
Possible duplicate of [Dead simple example of using Multiprocessing Queue, Pool and Locking](http://stackoverflow.com/questions/20887555/dead-simple-example-of-using-multiprocessing-queue-pool-and-locking) — Jean-François Fabre, Nov 23 '16 at 14:24
Using multiple threads or processes to process data provides no guarantees on the order of the results. You will need to sort it after the `Pool.map` if order matters to you. — RegularlyScheduledProgramming, Nov 23 '16 at 15:34
@Jean-FrançoisFabre @JaroslawMatlak I tried both `concurrent.futures` and `ThreadPool()`, they work, but it doesn't seem they parallelize anything, i.e. they compute similarly to the regular for-loop or even a bit slower :( — Wojtek Piechocki, Nov 23 '16 at 18:01
