I have a problem that is trivially parallelizable: I need to perform the same operation on 24 cdef objects. I know I could use multiprocessing for this, but copying the data and starting a new process takes as long as doing the computation serially, so nothing is gained. OpenMP might therefore be a better alternative.

The operation I'd like to do would look like this with multiprocessing:

multiprocessing.Pool().map(f, list_of_cython_objects)

Could something like the following work? Why or why not? I understand that I would have to create an array of pointers to the Cython objects, since I cannot use lists.

from cython.parallel import prange, parallel

with nogil, parallel():
    for i in prange(len(list_of_cython_objects), schedule='guided'):
        f(list_of_cython_objects[i])
The Unfun Cat
  • If the work itself doesn't need the GIL, a multithreading approach (no copying to workers) should work fine. But I have never used such an approach in Cython (only in pure Python). https://stackoverflow.com/a/45365174/4045774 – max9111 Aug 29 '18 at 13:53
  • Here are some ideas that might work: in your case (multiple executions of the same function), you could use a `multiprocessing.Pool`. Secondly, since you want to avoid copying your data, you could work with `multiprocessing.sharedctypes`, which allows objects to be allocated in shared memory – CoMartel Aug 31 '18 at 13:19
  • @CoMartel that is an interesting idea. Will look into it! :) – The Unfun Cat Sep 03 '18 at 10:21

1 Answer

Provided that most of f can run without the GIL (i.e. it only uses cdef attributes of the cdef class), this can be made to work pretty well. The only step that needs the GIL is indexing the list, and you can easily put that in a with gil block.

An illustrative example is:

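from cython.parallel import prange, parallel

cdef class C:
    # C-level attribute: it can be read and written without the GIL
    cdef double x

    def __init__(self,x):
        self.x = x

# Only touches the cdef attribute, so it can be declared nogil
cdef void f(C c) nogil:
    c.x *= 2


def test_function():
    list_of_cython_objects = [ C(x) for x in range(20) ]
    cdef int l = len(list_of_cython_objects)
    cdef int i
    cdef C c
    with nogil, parallel():
        for i in prange(l, schedule='guided'):
            # Indexing a Python list needs the GIL, so keep this block short
            with gil:
                c = list_of_cython_objects[i]
            # The actual work runs without the GIL
            f(c)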
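
One practical detail the example depends on: `prange` only runs in parallel if the module is compiled and linked with OpenMP enabled; without it the loop still works but runs serially. A minimal sketch of how to request this, assuming GCC and a build that goes through `cythonize` (other compilers use different flags, e.g. `/openmp` for MSVC), is to add directive comments to the header of the .pyx file:

```python
# distutils: extra_compile_args = -fopenmp
# distutils: extra_link_args = -fopenmp
# The two directive comments above belong in the comment header at the top of
# the .pyx file. The flags shown are an assumption for GCC; adjust them for
# your compiler.
```

The same flags can instead be passed as `extra_compile_args` and `extra_link_args` on the `Extension` object in setup.py if you prefer to keep build settings out of the source file.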

As long as the with gil blocks take up only a small proportion of the computation time, you should get most of the parallelization speed-up that you expect.
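
To put rough numbers on "small", one can use Amdahl's law as a back-of-the-envelope model. This is a sketch under stated assumptions, not a measurement: it treats the time spent inside with gil blocks as fully serialized, ignores scheduling and thread start-up overhead, and the function name `amdahl_speedup` is made up for illustration.

```python
def amdahl_speedup(serial_fraction: float, n_threads: int) -> float:
    """Upper bound on speed-up when `serial_fraction` of the runtime
    (here: the time spent holding the GIL) cannot run in parallel."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_threads)


# Compare a few hypothetical GIL fractions for 8 threads.
for frac in (0.01, 0.05, 0.25):
    bound = amdahl_speedup(frac, 8)
    print(f"{frac:.0%} of the time under the GIL: at most {bound:.2f}x")
```

Under this model, 1% of the time under the GIL still allows close to 7.5x on 8 threads, whereas 25% caps the speed-up at roughly 2.9x, which is why the with gil block should contain nothing beyond the list lookup.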

DavidW