10

I have code that takes a long time to run, so I've been investigating Python's multiprocessing library to speed things up. My code also has a few steps that use the GPU via PyOpenCL. The problem is that if I set multiple processes to run at the same time, they all end up trying to use the GPU simultaneously, and that often results in one or more of the processes throwing an exception and quitting.

In order to work around this, I staggered the start of each process so that they'd be less likely to bump into each other:

import multiprocessing
import time

process_list = []
num_procs = 4

# break the data into chunks so each process gets its own chunk
data_chunks = chunks(data, num_procs)
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # Instantiate the process, passing it its chunk of the data
    p = multiprocessing.Process(target=test, args=(chunk, arg1, arg2))
    # Keep the process in a list so that it remains accessible
    process_list.append(p)

# Start the processes, staggered by 5 seconds
for j, process in enumerate(process_list, 1):
    print('\nStarting process %i' % j)
    process.start()
    time.sleep(5)

for process in process_list:
    process.join()

I also wrapped a try/except retry loop around the call to the GPU function, so that if two processes DO try to access it at the same time, the one that doesn't get access waits a couple of seconds and tries again:

wait = 2
n = 0
while True:
    try:
        gpu_out = GPU_Obj.GPU_fn(params)
    except Exception:  # ideally catch the specific PyOpenCL exception here
        time.sleep(wait)
        print('\nWaiting for GPU memory...')
        n += 1
        if n == 5:
            raise Exception('Tried and failed %i times to allocate memory for opencl kernel.' % n)
        continue
    break

This workaround is very clunky, and even though it works most of the time, processes occasionally still throw exceptions. I feel like there should be a more efficient/elegant solution using multiprocessing.Queue or something similar, but I'm not sure how to integrate it with PyOpenCL for GPU access.
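
For example, I pictured something like a single dedicated worker process that owns the GPU and pulls jobs off a queue, so nothing else ever touches it (just a sketch; gpu_worker and the job layout are made up):

import multiprocessing

def gpu_worker(job_queue, result_queue):
    # The only process that ever calls into PyOpenCL, so there's no contention
    while True:
        job = job_queue.get()
        if job is None:  # sentinel: no more work coming
            break
        job_id, params = job
        result_queue.put((job_id, GPU_Obj.GPU_fn(params)))  # GPU_Obj as above

job_queue = multiprocessing.Queue()
result_queue = multiprocessing.Queue()
worker = multiprocessing.Process(target=gpu_worker, args=(job_queue, result_queue))
worker.start()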

johnny_be

1 Answer

15

Sounds like you could use a multiprocessing.Lock to synchronize access to the GPU:

data_chunks = chunks(data, num_procs)
lock = multiprocessing.Lock()
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # Instantiate the process, handing it the shared lock
    p = multiprocessing.Process(target=test, args=(chunk, arg1, arg2, lock))
    ...

Then, inside `test`, where you access the GPU:

with lock:  # Only one process will be allowed in this block at a time.
    gpu_out = GPU_Obj.GPU_fn(params)
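
Put together, `test` might look something like this (`prepare` is just a stand-in for whatever CPU-side work you do on each chunk):

def test(chunk, arg1, arg2, lock):
    # CPU-bound work can run in all processes simultaneously...
    params = prepare(chunk, arg1, arg2)
    # ...but only one process at a time executes the GPU call.
    with lock:
        gpu_out = GPU_Obj.GPU_fn(params)
    return gpu_out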

Edit:

To do this with a Pool, you'd do something like this:

# At global scope
lock = None

def init(_lock):
    global lock
    lock = _lock

data_chunks = chunks(data, num_procs)
lock = multiprocessing.Lock()
# Create the pool once; the initializer runs in each worker and makes the lock global there
pool = multiprocessing.Pool(initializer=init, initargs=(lock,))
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # test picks up the global lock set by init, so it isn't passed as an argument
    pool.apply(test, args=(chunk, arg1, arg2))
    ...
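
Note that `pool.apply` blocks until the task finishes, so the loop above runs the chunks one at a time; if you want several in flight at once, submit with `apply_async` and collect the results afterwards:

results = [pool.apply_async(test, args=(chunk, arg1, arg2))
           for chunk in data_chunks if len(chunk) > 0]
outputs = [r.get() for r in results]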

Or, using a `Manager` lock, which can be pickled and passed directly:

data_chunks = chunks(data, num_procs)
m = multiprocessing.Manager()
lock = m.Lock()
# A Manager lock is picklable, so it can be passed to apply like any other argument
pool = multiprocessing.Pool()
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    pool.apply(test, args=(chunk, arg1, arg2, lock))
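
One tradeoff to be aware of: every acquire/release of a Manager lock is a round-trip to the manager process, so it's a bit slower than a plain `multiprocessing.Lock`, but in exchange the lock proxy can be shipped to pool workers like any other argument.
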
dano
  • Wow! That's a lot simpler than I would have thought! I'm assuming that you would use `multiprocessing.Lock` the same way if you were using `multiprocessing.Pool` instead of `multiprocessing.Process`? – johnny_be Apr 13 '15 at 19:11
  • @johnny_be It actually gets a bit more complicated if you use a pool, because you can't normally pickle a `Lock`, so passing it to `pool.map`/`pool.apply` won't work. In that case, you either need to pass the lock to the `Pool` constructor: `Pool(initializer=init, initargs=(lock,))`, and then make the lock global inside the `init` function, or use a [`multiprocessing.Manager().Lock()`](https://docs.python.org/2.7/library/multiprocessing.html#multiprocessing.managers.SyncManager.Lock), which can be pickled. See my edit for more complete examples. – dano Apr 13 '15 at 19:20
  • I don't suppose there's a way you could implement some sort of lock if you are just running the same process in multiple python consoles at the same time? (i.e. open command window, run `python my_script.py`, open another command window, run `python my_script.py` in the new window, etc.) – johnny_be Apr 13 '15 at 21:04
  • @johnny_be There's a way you can do it using `multiprocessing.managers.BaseManager`. See [this answer](http://stackoverflow.com/a/29423828/2073595), which shows how you could do it for a `Barrier` rather than a `Lock`. The same idea applies: one instance of your script acts as a manager server, and the others are clients. You'd just need to make sure the server instance doesn't exit until all the clients are done. – dano Apr 13 '15 at 21:11
  • Wow, thanks again! Another question: can you implement multiple different locks? For example, let's say I want to lock GPU functions so that multiple processes don't use the GPU at the same time, but I also want to lock disk writes so that multiple processes don't try to write to the disk at the same time. There's no problem, however, if one process accesses the GPU while another accesses the disk, so I wouldn't want the same lock for both. Is that possible? Would it just be as simple as adding a second lock, `lock2 = multiprocessing.Lock()`? – johnny_be Apr 13 '15 at 21:41
  • @johnny_be Yep, just create another lock instance. – dano Apr 13 '15 at 21:46
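
For illustration, the two-lock idea from the last comment is just two independent locks, each guarding its own resource (a sketch; `write_results` is a placeholder for your disk-writing function):

gpu_lock = multiprocessing.Lock()
disk_lock = multiprocessing.Lock()

def worker(chunk, gpu_lock, disk_lock):
    with gpu_lock:    # serializes GPU access only
        out = GPU_Obj.GPU_fn(chunk)
    with disk_lock:   # serializes disk writes only
        write_results(out)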