I am trying to speed up my Python script, which uses VTK methods (and vtkobjects) to process geometric measurements. Since some of my methods involve looping over very similar meshes and computing the enclosed points for each of them, I simply wanted to parallelise such for loops:

averaged_contained_points = []

for intersection_actor in intersection_actors:
    contained_points = vtk_mesh.points_inside_mesh(point_data=point_data, mesh=intersection_actor.GetMapper().GetInput())
    mean_pos = np.mean(contained_points, axis=0)
    averaged_contained_points.append(mean_pos)

In this case the function vtk_mesh.points_inside_mesh calls vtk.vtkSelectEnclosedPoints() and takes a vtkActor and vtkPolyData as input.

The main question is: How can this be converted to run in parallel?

My initial attempt was to import multiprocessing, but I then switched to import pathos.multiprocessing, which seems to have a few advantages; the two work fairly similarly.

The problem is that the code below doesn't work.

def _parallel_generate_intersection_avg(inputs):
    point_data = inputs[0]
    intersection_actor = inputs[1]

    contained_points = vtk_mesh.points_inside_mesh(point_data=point_data, mesh=intersection_actor.GetMapper().GetInput())

    if len(contained_points) == 0:
        return np.array([-1, -1, -1])

    return np.mean(contained_points, axis=0)

pool = ProcessingPool(CPU_COUNT)
inputs = [[point_data,intersection_actor] for intersection_actor in intersection_actors]
averaged_contained_points = pool.map(_parallel_generate_intersection_avg, inputs)

It results in this sort of error:

pickle.PicklingError: Can't pickle 'vtkobject' object: (vtkPolyData)0x111ed5bf0

I have done some research and found that vtkobjects probably can't be pickled:

Can't pickle <type 'instancemethod'> when using python's multiprocessing Pool.map()

However, since I couldn't find a solution for running python vtk code in parallel with the available answers, please let me know if you have any suggestions.
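The failure can be reproduced without VTK at all: any object whose state lives outside the Python interpreter refuses to pickle, and pickling is exactly what pool.map does to every argument. Here a threading.Lock stands in for a vtkPolyData as a minimal illustration (not VTK-specific code):

```python
import pickle
import threading

# A threading.Lock, like a vtkPolyData, wraps state held in C outside
# the interpreter, so pickle has no way to serialize it.
lock = threading.Lock()

try:
    pickle.dumps(lock)
    picklable = True
except TypeError as exc:
    picklable = False
    print("pickling failed:", exc)
```

Anything handed to pool.map has to survive that pickle.dumps call, which is why the vtkPolyData arguments above fail before the worker function ever runs.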

[EDIT]

I didn't try to implement threading, mainly, because I read the comments to the answer in this thread: How do I parallelize a simple Python loop?

Using multiple threads on CPython won't give you better performance for pure-Python code due to the global interpreter lock (GIL)

  • Tried threading? https://docs.python.org/2/library/threading.html – Stiffo Aug 12 '15 at 11:10
  • @Stiffo Thanks, I thought someone would suggest this, but I expected to run into the same problems with threading, and it seemed a little more complicated to implement. So I haven't tried it yet. – Chris Aug 12 '15 at 11:36

1 Answer

Unlike with threading, arguments passed to a multiprocessing Process (or through a Pool) must be serializable with pickle.

example:

def functionWithPickableInput(inputstring0):
    # Rebuild the polydata from its string form inside the worker process
    r0 = vtk.vtkPolyDataReader()
    r0.ReadFromInputStringOn()
    r0.SetInputString(inputstring0)
    r0.Update()
    polydata0 = r0.GetOutput()
    return functionWithVtkInput(polydata0)

# Compute the strings to use as input (they are the contents of the
# corresponding .vtk files) -- this runs once, outside the parallel map
vtkstrings = []
w = vtk.vtkPolyDataWriter()
w.WriteToOutputStringOn()
for mesh in meshes:
    w.SetInputData(mesh)
    w.Write()
    vtkstrings.append(w.GetOutputString())

Here I chose to write everything in memory (see the ReadFromInputString / WriteToOutputString methods in http://www.vtk.org/doc/nightly/html/classvtkDataReader.html#a122da63792e83f8eabc612c2929117c3 and http://www.vtk.org/doc/nightly/html/classvtkDataWriter.html#a8972eec261faddc3e8f68b86a1180c71). Of course, you will have to call the writer outside the parallel loop, so judge whether the writer's overhead is reasonable with respect to the function you want to parallelize. You can also read your polydata from a file if memory is a concern.

    Actually, thinking about it, the last link says that you can use a vtkarray class derived from numpy ndarray; that will also make the object picklable for multiprocessing – lib Aug 13 '15 at 06:15
    And pickle is probably going to write to a file; if so, it's not worth writing a string and having pickle read it and write it to a file, just use the polydata writer. Later I will update my answer with an example using the vtk dataset_adapter http://www.kitware.com/blog/home/post/709 and multiprocessing (pickle) – lib Aug 13 '15 at 07:46
  • Building on this answer I created an example that uses multiprocessing and queues to generate VTK shapes in child processes which are then added to a plot. You can play with how many spheres get generated per job and how many jobs to kick off at a time. Spheres colored based on which worker created it. https://gist.github.com/flutefreak7/41eb05be858e511a683ad7f1e188c29c If people wanted to play with alternate forms of serialization besides vtkPolyDataWriter, this might provide a nice little demo / benchmark platform. – flutefreak7 Oct 18 '18 at 04:52
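As the first comment notes, NumPy arrays pickle cleanly, which is what makes the numpy-backed VTK wrappers usable as multiprocessing arguments; a minimal check of that round trip (pure NumPy, no VTK required):

```python
import pickle

import numpy as np

# A NumPy array, unlike a raw vtkobject, round-trips through pickle,
# so it can be passed to pool.map workers directly.
arr = np.arange(12, dtype=float).reshape(4, 3)
restored = pickle.loads(pickle.dumps(arr))
print(np.array_equal(arr, restored))  # → True
```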