
Background

I want to predict pathology images using Keras with Inception-ResNet-v2. I have already trained the model and have a .hdf5 file. Because a pathology image is very large (for example: 20,000 x 20,000 pixels), I have to scan the image and extract small patches for prediction.

I want to speed up the prediction procedure using the multiprocessing library with Python 2.7. The main idea is to use different subprocesses to scan different lines of the image and then send the patches to the model.

I have seen suggestions to import Keras and load the model inside each subprocess, but I don't think that is suitable for my task. Loading the model once with keras.models.load_model() takes about 47 s, which is very time-consuming, so I can't afford to reload the model every time I start a new subprocess.

Question

My question is: can I load the model in my main process and pass it as a parameter to the subprocesses?

I have tried two methods, but neither of them works.

Method 1. Using multiprocessing.Pool

The code is:

import keras
from keras.models import load_model
import multiprocessing

def predict(args):
    num, model = args  # Pool.map passes each tuple as a single argument
    print num
    print dir(model)
    model.predict("image data, type:list")

if __name__ == '__main__':
    model = load_model("path of hdf5 file")
    tasks = [(1, model), (2, model), (3, model), (4, model), (5, model), (6, model)]
    pool = multiprocessing.Pool(4)
    pool.map(predict, tasks)
    pool.close()
    pool.join()

The output is:

cPickle.PicklingError: Can't pickle <type 'module'>: attribute lookup __builtin__.module failed

I searched for the error and found that Pool can't map unpicklable parameters, so I tried Method 2.

Method 2. Using multiprocessing.Process

The code is:

import keras
from keras.models import load_model
import multiprocessing

def predict(num, model):
    print num
    print dir(model)
    model.predict("image data, type:list")

if __name__ == '__main__':
    model = load_model("path of hdf5 file")
    tasks = [(1, model), (2, model), (3, model), (4, model), (5, model), (6, model)]
    proc = []
    for i in range(4):
        # Process takes the callable and its arguments as keywords
        proc.append(multiprocessing.Process(target=predict, args=tasks[i]))
        proc[i].start()
    for i in range(4):
        proc[i].join()

In Method 2, I can print dir(model), so I think the model was passed to the subprocesses successfully. But then I got this error:

E tensorflow/stream_executor/cuda/cuda_driver.cc:1296] failed to enqueue async memcpy from host to device: CUDA_ERROR_NOT_INITIALIZED; GPU dst: 0x13350b2200; host src: 0x2049e2400; size: 4=0x4

My environment:

  • Ubuntu 16.04, Python 2.7
  • Keras 2.0.8 (TensorFlow backend)
  • one Titan X, Driver version 384.98, CUDA 8.0

Looking forward to your replies! Thanks!

Eason Yang
  • Have you ever solved this problem? I am facing the same pickling problem here. Using a pure Process instead of a Pool made the process hang instead of failing to pickle, but I am not sure whether that is progress at all. – Eduardo Oct 27 '18 at 16:40

3 Answers


Maybe you can use apply_async() instead of Pool.map().

You can find more details here:

Python multiprocessing pickling error
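
A minimal sketch of the apply_async() call pattern with a toy worker (predict here is a stand-in function, not the Keras model):

import multiprocessing

def predict(num):
    # Hypothetical worker; the return value must itself be picklable.
    return num * num

if __name__ == '__main__':
    pool = multiprocessing.Pool(4)
    # apply_async submits one call at a time and returns an AsyncResult.
    # Note the arguments are still sent through pickle, so passing an
    # unpicklable Keras model here would fail in the same way.
    results = [pool.apply_async(predict, args=(i,)) for i in range(6)]
    pool.close()
    pool.join()
    print [r.get() for r in results]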

Statham

Multiprocessing works on the CPU, while model prediction happens on the GPU, of which there is only one. I cannot see how multiprocessing can help you with prediction.

Instead, I think you can use multiprocessing to scan different patches, which you seem to have already managed to achieve. Then stack these patches into a batch or batches and predict them in parallel on the GPU.
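
One way to arrange this, as a rough sketch: keep the model in the main process only, have the workers return plain numpy arrays (which pickle cheaply), and run one batched predict() on the GPU. The slide-reading logic is elided; scan_line, the paths, and the 299-pixel patch size (Inception-ResNet-v2's input size) are assumptions for illustration:

import multiprocessing
import numpy as np

def scan_line(args):
    # Hypothetical worker: cut one row of the slide into patches.
    image_path, row, patch_size = args
    # Placeholder extraction so the sketch runs end to end; replace
    # with real slide reading for image_path / row.
    return [np.zeros((patch_size, patch_size, 3), dtype='float32')
            for _ in range(4)]

if __name__ == '__main__':
    from keras.models import load_model
    model = load_model("path of hdf5 file")  # loaded once, main process only

    pool = multiprocessing.Pool(4)
    jobs = [("path of slide", row, 299) for row in range(8)]
    patches = []
    for row_patches in pool.map(scan_line, jobs):
        patches.extend(row_patches)
    pool.close()
    pool.join()

    # One batched GPU call instead of many per-patch calls.
    batch = np.stack(patches)
    predictions = model.predict(batch, batch_size=32)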

Diansheng

As noted by Statham, multiprocessing requires all args to be compatible with pickle. This blog post describes how to save a Keras model as a pickle: http://zachmoshe.com/2017/04/03/pickling-keras-models.html. It may be a sufficient workaround to get your Keras model passed as an arg to multiprocessing, but I have not tested the idea myself.
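
Roughly, the trick in that post is to monkey-patch keras.models.Model with __getstate__/__setstate__ methods that round-trip the model through Keras's own HDF5 serialization. A sketch of the idea (written from memory of the post, so treat it as an approximation and check the post for the exact code):

import tempfile
import keras.models

def make_keras_picklable():
    # Serialize via save_model into a temporary HDF5 file, and
    # deserialize by loading that file back with load_model.
    def __getstate__(self):
        with tempfile.NamedTemporaryFile(suffix='.hdf5', delete=True) as fd:
            keras.models.save_model(self, fd.name, overwrite=True)
            model_str = fd.read()
        return {'model_str': model_str}

    def __setstate__(self, state):
        with tempfile.NamedTemporaryFile(suffix='.hdf5', delete=True) as fd:
            fd.write(state['model_str'])
            fd.flush()
            model = keras.models.load_model(fd.name)
        self.__dict__ = model.__dict__

    keras.models.Model.__getstate__ = __getstate__
    keras.models.Model.__setstate__ = __setstate__

Note that unpickling goes through load_model, so each subprocess would still pay the full model-loading cost once when it first receives the model.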

I will also add that I had better luck running two Keras processes on a single GPU using Windows rather than Linux. On Linux I was getting out-of-memory errors on the second process, but the same memory allocation (45% of total GPU RAM for each) worked on Windows. In my case the processes were running fits; for running predictions only, the memory requirements may be smaller.
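
For reference, that per-process cap can be set through the TensorFlow session Keras uses. A minimal sketch for a Keras 2.0.x / TensorFlow-backend setup, assuming the 45% split mentioned above; run it before building or loading the model in each process:

import tensorflow as tf
from keras.backend.tensorflow_backend import set_session

# Cap this process at roughly 45% of GPU memory so a second
# process can claim a similar share.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.45
set_session(tf.Session(config=config))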

J B