I have a short snippet making the pool.map
call to hang up. It does not return in any reasonable amount of time. The idea is to have a couple of point lists and find a measure for the correspondence to another point list.
import numpy as np
import multiprocessing
import scipy.spatial as ss
def dist(args): # returns the sum of distances of the nearest neighbors.
kdSet, points = args
dist, _ = kdSet.query(points)
return np.sum(dist)
#just define some points to play with
points = np.array([[13.27, 25.49], [13.18, 25.39], [13.08, 25.39],
[12.99, 25.39], [12.89, 25.39], [12.80, 25.39],
[12.71, 25.39], [12.61, 25.30], [12.52, 25.30]])
pointList = [points + idx for idx in range(3)] #creates a list of points
#create a list of KDTrees for fast lookup. Add some number to make them a little different from pointList
kdTreeList = [ss.KDTree(pointsLocal + idx/10) for idx, pointsLocal in enumerate(pointList)]
#this part works fine: serial execution
distances = list(map(dist, zip(kdTreeList, pointList)))
print(distances)
#this part hangs, if called as multiprocess, expected the same result as above
with multiprocessing.Pool() as pool:
print('Calling pool.map')
distancesMultiprocess = pool.map(dist, zip(kdTreeList, pointList)) #<-This call does not return, but uses the CPU 100%
print(distancesMultiprocess)
Calling the function sequentially it works fine. But if called in the pool.map function it neither gives an error nor it is returning anything. It just hangs there. The CPU load goes to 100%, so something is happening.
Removing the KDTree class for testing purposes did not show a different result.
Since there is no error thrown, I have no idea whats going wrong here. Any hint?