Short version: A TensorFlow Keras model trains much more slowly on a Tesla T4 than on a GTX 970 (both GPUs are working; I checked with nvidia-smi).
Long version: I have two host machines. One is a PC with a GTX 970 and TensorFlow 2.1.0; the other is a GCP AI Platform Notebook with a Tesla T4 and TensorFlow 2.3.0. I am running the same code on both of them. All training data is stored in RAM as numpy arrays, the dtype is the same ('float16'), and batch_size is also the same (8; the GTX wouldn't work with anything above that, but I have also tried 64 on the Tesla and it didn't make any difference). Why is the Tesla, with twice the computing power of the GTX, running about 2.5 times slower? How can I use my GPU correctly to train my models faster?
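To make the comparison between the two machines concrete, per-step time can be measured the same way on both. The following is only a minimal sketch, not my actual training code: the helper names mean_step_seconds and benchmark_keras, the small Dense model, and the random data are hypothetical stand-ins, and the real model and arrays would need to be substituted.

```python
import time


def mean_step_seconds(step_fn, n_steps=50, warmup=5):
    """Return the average wall-clock seconds per call of step_fn.

    The warm-up calls run first and are not timed, so one-time costs
    (for example graph tracing on the first batch) are excluded.
    """
    for _ in range(warmup):
        step_fn()
    start = time.perf_counter()
    for _ in range(n_steps):
        step_fn()
    return (time.perf_counter() - start) / n_steps


def benchmark_keras(batch_sizes=(8, 64)):
    """Print TensorFlow's view of the GPU and time one training step per batch size.

    Assumption: the model and data below are placeholders, not the
    model from the question.
    """
    import numpy as np
    import tensorflow as tf

    print("TensorFlow", tf.__version__)
    print("Visible GPUs:", tf.config.list_physical_devices("GPU"))

    # Hypothetical stand-in model; replace with the real one.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu", input_shape=(128,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

    # Random placeholder data held in RAM as numpy arrays.
    x = np.random.rand(1024, 128).astype("float32")
    y = np.random.rand(1024, 1).astype("float32")

    for batch_size in batch_sizes:
        xb, yb = x[:batch_size], y[:batch_size]
        sec = mean_step_seconds(lambda: model.train_on_batch(xb, yb))
        print(f"batch_size={batch_size}: {sec * 1e3:.2f} ms/step, "
              f"{batch_size / sec:.0f} samples/s")
```

Calling benchmark_keras() on each machine gives milliseconds per step and samples per second for batch sizes 8 and 64, which shows whether the step time changes with batch size at all on either GPU.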