I'm using Keras for a sliding-window object detection system, which naturally requires running a very large number of classifications quickly. Unfortunately, Keras's model.predict() function has significant per-call overhead: whatever it does before the forward pass (loading? preprocessing the data? I don't know) takes longer than the actual network computation. I know this because I've tried removing layers and it makes almost no difference to the time spent in a model.predict() call.
So what I'm looking for is a way to use one network to run predictions on several inputs at once, not necessarily in separate threads, but without control returning to my code between inputs. Is anyone aware of such a technique?
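
To make the goal concrete, here is a minimal sketch of the kind of batching I have in mind: stack every window into one array and make a single prediction call on it, so the fixed per-call cost is paid once rather than once per window. The helper names extract_windows and classify_windows, and the window and stride values, are hypothetical and only for illustration. predict_fn stands in for any callable that accepts an array of shape (N, height, width, channels), which I assume could be model.predict itself.

```python
import numpy as np

def extract_windows(image, window=32, stride=16):
    """Stack all sliding windows of `image` into one (N, window, window, C) array.

    Assumes `image` is at least `window` pixels in both height and width.
    """
    h, w = image.shape[:2]
    # Top-left corner of every window, scanned row by row.
    coords = [(y, x)
              for y in range(0, h - window + 1, stride)
              for x in range(0, w - window + 1, stride)]
    batch = np.stack([image[y:y + window, x:x + window] for y, x in coords])
    return batch, coords

def classify_windows(predict_fn, image, window=32, stride=16):
    """Score every window with ONE call to `predict_fn` instead of one call per window."""
    batch, coords = extract_windows(image, window, stride)
    scores = predict_fn(batch)  # e.g. predict_fn = model.predict (assumption)
    return coords, scores
```

With a Keras model this would presumably be called as classify_windows(model.predict, image), and the batch_size argument of model.predict could be used to limit how many windows go through the network at once if memory is tight.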