
I have a trained keras model, and I am trying to run predictions with CPU only. I want this to be as quick as possible, so I thought I would use predict_generator with multiple workers. All of the data for my prediction tensor are loaded into memory beforehand. Just for reference, array is a list of tensors, with the first tensor having shape [nsamples, x, y, nchannels]. I made a thread-safe generator following the instructions here (I followed this when using fit_generator as well).

import numpy as np
import keras

class DataGeneratorPredict(keras.utils.Sequence):
    'Generates data for Keras'
    def __init__(self, array, batch_size=128):
        'Initialization'
        self.array = array
        self.nsamples = array[0].shape[0]
        self.batch_size = batch_size
        self.ninputs = len(array)
        self.indexes = np.arange(self.nsamples)

    def __len__(self):
        'Denotes the number of batches'
        return int(np.floor(self.nsamples / self.batch_size))

    def __getitem__(self, index):
        'Generate one batch of data'
        # Slice out the indices for this batch
        inds = self.indexes[index*self.batch_size:(index+1)*self.batch_size]

        # Gather the batch slice from each input tensor
        X = []
        for inp in range(self.ninputs):
            X.append(self.array[inp][inds])

        return X

I run predictions with my model like so,

# all_test_in is my list of input data tensors
gen = DataGeneratorPredict(all_test_in, batch_size=1024)
new_preds = conv_model.predict_generator(gen, workers=4, use_multiprocessing=True)

but I don't get any speed improvement over using conv_model.predict, regardless of the number of workers. This seemed to work well when fitting my model (i.e., getting a speed-up using a generator with multiple workers). Am I missing something in my generator? Is there a more efficient way to optimize predictions (besides using GPU)?


1 Answer


When you just call .predict, Keras (via TensorFlow) already uses all available cores and evaluates the data points you give it in parallel. Running predict_generator with multiple workers is unlikely to add any benefit here, because each worker either has to wait its turn to execute or must share the same cores. Either way, you end up with the same performance.
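The core-sharing point is easy to see outside Keras: the heavy lifting in predict is batched linear algebra, which NumPy/BLAS already parallelizes across the batch. A minimal sketch in pure NumPy, with a hypothetical weight matrix `W` standing in for a model layer:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4096, 256)).astype(np.float32)  # a batch of inputs
W = rng.standard_normal((256, 10)).astype(np.float32)    # stand-in for a layer's weights

# One big batched call: the BLAS backend can fan the matrix
# product out across all available cores.
batched = X @ W

# Per-sample loop: same math, but with Python overhead per sample
# and no opportunity for the library to parallelize over the batch.
looped = np.stack([x @ W for x in X])

# Both produce the same result; the batched call is the fast path.
assert np.allclose(batched, looped, atol=1e-4)
```

Adding worker processes on top of an already-parallel batched call just makes the workers compete for the same cores, which is why the timings come out the same.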

Generators are more useful when your data:

  • does not fit in memory: you can predict one batch at a time instead of building one large data array and calling predict on it.
  • requires on-the-fly processing that may change, or be random, per batch.
  • cannot be stored easily in a NumPy array and needs a batching scheme beyond slicing data points.
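For the does-not-fit-in-memory case, the same batching logic as DataGeneratorPredict can read each batch lazily from disk instead of holding everything in RAM. A sketch using a memory-mapped .npy file (the file name, shapes, and `get_batch` helper are made up for illustration; a real version would subclass keras.utils.Sequence just like the class in the question):

```python
import numpy as np
import os
import tempfile

# Pretend this large array lives on disk rather than in memory.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "samples.npy")
np.save(path, np.arange(10_000, dtype=np.float32).reshape(1000, 10))

# mmap_mode='r' maps the file instead of loading it; slicing pulls in
# only the pages each batch actually touches.
data = np.load(path, mmap_mode="r")

batch_size = 128
nbatches = int(np.ceil(data.shape[0] / batch_size))

def get_batch(index):
    # Same slicing logic as DataGeneratorPredict.__getitem__,
    # but applied to the memory-mapped array.
    return np.asarray(data[index*batch_size:(index+1)*batch_size])

# Every sample is covered, but at most one batch is resident at a time.
batches = [get_batch(i) for i in range(nbatches)]
assert sum(len(b) for b in batches) == 1000
```

Note this uses np.ceil rather than the question's np.floor, so a partial final batch is not silently dropped (with floor, predict_generator returns fewer predictions than samples when nsamples is not a multiple of batch_size).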
nuric
  • 11,027
  • 3
  • 27
  • 42
  • Thanks for the answer. Can you provide a link on keras .predict using all available cores? So is there no way to decrease prediction time except to use GPU (or get more CPUs)? – weather guy Oct 02 '19 at 20:11
  • It's actually not Keras does it, but Tensorflow, [it uses all cores](https://stackoverflow.com/questions/38836269/does-tensorflow-view-all-cpus-of-one-machine-as-one-device) by default. Yes, either GPU or more CPUs to speed it up. – nuric Oct 02 '19 at 20:21