I have a program that is currently quite slow and CPU heavy, and I'm confused about how to parallelise it.
My Problem:
The algorithm works like this (with random numbers):
import numpy as np
import scipy.spatial as spatial
from collections import Counter
# Particle dictionary
num_parts = int(1e4)
particles = {'coords': np.random.rand(num_parts, 3),
             'flags': np.random.randint(0, 4, size=num_parts)}
# Object dictionary
num_objects = int(1e3)
objects_dict = {'x': [np.random.rand(1)
                      for i in range(0, num_objects)],
                'y': [np.random.rand(1)
                      for i in range(0, num_objects)],
                'z': [np.random.rand(1)
                      for i in range(0, num_objects)],
                'prop_to_calc': [np.array([])
                                 for i in range(0, num_objects)]}
# Build KDTree for particles
tree = spatial.cKDTree(particles['coords'])
# Loop through objects to calculate 'prop_to_calc' based
# on nearby particles flags within radius, r.
r = 0.1
for i in range(0, num_objects):
    # Find nearby indices of particles.
    indices = tree.query_ball_point([objects_dict['x'][i][0],
                                     objects_dict['y'][i][0],
                                     objects_dict['z'][i][0]],
                                    r)
    # Extract flags of nearby particles.
    flags_array = particles['flags'][indices]
    # Find most common flag and store that in prop_to_calc.
    c = Counter(flags_array)
    objects_dict['prop_to_calc'] = np.append(objects_dict['prop_to_calc'],
                                             c.most_common(1)[0][0])
There are two data sets, particles and objects_dict. I want to calculate objects_dict['prop_to_calc'] by searching for nearby particles within radius r and finding their most common flag. This is done with cKDTree and query_ball_point.
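For reference, the lookup described above can be restated as one batched call, since query_ball_point also accepts an array of points. This is only a sketch and not the code timed in this post: most_common_flags is a hypothetical helper name, the -1 sentinel for "no particle within r" is an assumption, and it assumes the flags are small non-negative integers so that np.bincount applies. Ties go to the smallest flag, which may differ from Counter.

```python
import numpy as np
from scipy import spatial


def most_common_flags(tree, flags, points, r):
    """For each row of `points`, return the most common flag among the
    particles within distance r, or -1 if there are none (assumed sentinel)."""
    neighbours = tree.query_ball_point(points, r)  # one list of indices per point
    out = np.full(len(points), -1, dtype=np.int64)
    for k, idx in enumerate(neighbours):
        if idx:
            # bincount counts each flag value; argmax picks the most frequent.
            out[k] = np.bincount(flags[idx]).argmax()
    return out


# Tiny hand-checkable data set (illustrative values only).
coords = np.array([[0.0, 0.0, 0.0],
                   [0.01, 0.0, 0.0],
                   [0.0, 0.01, 0.0],
                   [0.9, 0.9, 0.9]])
flags = np.array([2, 2, 1, 3])
tree = spatial.cKDTree(coords)
points = np.array([[0.0, 0.0, 0.0],
                   [0.9, 0.9, 0.9],
                   [0.5, 0.5, 0.5]])
result = most_common_flags(tree, flags, points, r=0.1)
```

Here the first point sees flags 2, 2, 1, the second sees only flag 3, and the third has no particle within r.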
For these sizes (num_parts=1e4, num_objects=1e3), the timing in IPython 4.2.0 is:
10 loops, best of 3: 55.8 ms per loop
However, I want num_parts=1e6 and num_objects=1e5, which results in a serious slowdown.
My question:
As it is CPU heavy, I want to try to parallelise it to get some speed-up.
I have looked at both multiprocessing and multithreading. However, the docs confuse me quite a lot, and I'm not sure how to apply the examples to this problem.
Specifically, I'm concerned with how to share both dictionaries between processes and write into objects_dict at the end.
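To make the sharing concern concrete, here is a rough sketch of one possible layout, under several assumptions: each worker builds its own cKDTree from the particle arrays through a pool initializer, each task returns a plain array, and only the parent process writes the results. The names init_worker, process_chunk, and run_parallel, the -1 sentinel, and the choice of one chunk per worker are all illustrative and not part of the code above. Whether this actually gives a speed-up is untested.

```python
import numpy as np
from multiprocessing import Pool
from scipy import spatial

# Per-process globals, filled in by init_worker (hypothetical helper names).
_tree = None
_flags = None


def init_worker(coords, flags):
    """Run once in each worker: build that worker's own tree."""
    global _tree, _flags
    _tree = spatial.cKDTree(coords)
    _flags = flags


def process_chunk(args):
    """Return the most common nearby flag for each point in one chunk."""
    points, r = args
    # -1 marks "no particle within r" (assumed sentinel).
    result = np.full(len(points), -1, dtype=np.int64)
    for k, idx in enumerate(_tree.query_ball_point(points, r)):
        if idx:
            result[k] = np.bincount(_flags[idx]).argmax()
    return result


def run_parallel(points, coords, flags, r, n_workers=4):
    """Split the object positions into chunks and map them over a pool."""
    chunks = [(chunk, r) for chunk in np.array_split(points, n_workers)]
    with Pool(n_workers, initializer=init_worker,
              initargs=(coords, flags)) as pool:
        parts = pool.map(process_chunk, chunks)
    # pool.map keeps the chunk order, so the parent can simply concatenate.
    return np.concatenate(parts)
```

With this layout, run_parallel would be called from under an if __name__ == "__main__": guard, passing an array of shape (num_objects, 3) built from the x, y, and z entries, and its return value would be assigned to objects_dict['prop_to_calc'] in the parent process. No worker writes into the dictionary directly.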
Thanks for the help in advance.