
When I run fit() with use_multiprocessing=True, I always get a deadlock and the following warning:

WARNING:tensorflow:multiprocessing can interact badly with TensorFlow, causing nondeterministic deadlocks. For high performance data pipelines tf.data is recommended.

How do I run it properly?

Since the warning says "tf.data", I wonder if transforming my data into that format will make multiprocessing work. What specifically is meant, and how do I convert my data?

My dataset (reproducible):

import numpy as np

Input_shape, labels = (20, 4), 6
LEN_X, LEN_Y = 20000, 3000
train_X, train_Y = np.asarray([np.random.random(Input_shape) for x in range(LEN_X)]), np.random.random((LEN_X, labels))
validation_X, validation_Y = np.asarray([np.random.random(Input_shape) for x in range(LEN_Y)]), np.random.random((LEN_Y, labels))
sampleW = np.random.random((LEN_X, 1))
La-Li-Lu-Le-Low

1 Answer


Multiprocessing doesn't accelerate the model itself; it only accelerates the data loading. And data-loading delay is not a problem when all your data is already in memory.

You could still use multiprocessing, but you would have to make sure the underlying dataset is thread-safe and carefully craft the data pipeline. That is quite time-consuming, so instead I suggest you speed up the model itself.
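That said, since the warning recommends tf.data: for in-memory NumPy arrays like yours, the conversion is essentially one call to tf.data.Dataset.from_tensor_slices. A minimal sketch, reusing the array names from your question (the batch size and shuffle buffer here are illustrative choices, not tuned values):

import tensorflow as tf

# Build a dataset from the in-memory arrays; fit() treats the third
# element of each tuple as per-sample weights.
train_ds = tf.data.Dataset.from_tensor_slices(
    (train_X, train_Y, sampleW.ravel()))  # ravel: weights should be 1-D
train_ds = train_ds.shuffle(buffer_size=LEN_X).batch(32).prefetch(tf.data.AUTOTUNE)

val_ds = tf.data.Dataset.from_tensor_slices((validation_X, validation_Y))
val_ds = val_ds.batch(32).prefetch(tf.data.AUTOTUNE)

# model.fit(train_ds, validation_data=val_ds, epochs=10)

With a tf.data pipeline, parallelism is handled inside TensorFlow (e.g. via prefetch), so you can drop use_multiprocessing=True entirely, which sidesteps the deadlock.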

To speed up the model itself, you should look into:

  • changing the activations of all layers except the last one to ReLU.
  • tweaking the batch size (the optimal number depends on your hardware, and is almost always less than or equal to 32).
  • using batch normalization to speed up convergence.
  • using a higher learning rate (be careful not to overdo this step).
  • if you need faster convolutions, consider using Kaggle notebooks or vast.ai for GPU-enabled computations.
  • last but not least, trying a simpler, smaller model. A minimal sketch combining several of these ideas follows below.
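Here is what such a model could look like for your (20, 4) inputs and 6 continuous labels; the layer widths, loss, and learning rate are illustrative assumptions, not tuned values:

import tensorflow as tf
from tensorflow.keras import layers

# Small model: ReLU activations everywhere except the output layer,
# with batch normalization after each hidden layer.
model = tf.keras.Sequential([
    layers.Input(shape=(20, 4)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.BatchNormalization(),
    layers.Dense(32, activation="relu"),
    layers.BatchNormalization(),
    layers.Dense(6),  # output layer: no ReLU here
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-3),  # a bit above the 1e-3 default
    loss="mse",
)

model.fit(train_X, train_Y,
          sample_weight=sampleW.ravel(),  # Keras expects a 1-D weight array
          validation_data=(validation_X, validation_Y),
          batch_size=32, epochs=10)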

Comment down here if you have any additional questions.
Cheers.

tornikeo
  • I've noticed that for DeepMind's AlphaZero (https://arxiv.org/pdf/1712.01815.pdf), they used mini-batches of size 4096. Is that just them flexing their computing power on us, or is it an exception to the <=32 rule? – Arkleseisure Sep 07 '20 at 14:45
  • @Arkleseisure That's a good question. I'm not sure, but maybe they used the results of a [paper](https://arxiv.org/abs/1705.08741) that advocated huge batch sizes in that very same year. Reinforcement learning is *very* hard to get right; we don't even know how to find correct batch sizes in general yet. – tornikeo Sep 07 '20 at 14:54