I have only one GPU (a Titan X Pascal with 12 GB of VRAM) and I would like to train multiple models on it in parallel.
I tried encapsulating my model in a single Python program (called model.py), and I included code in model.py to restrict its VRAM usage (based on this example; a sketch of the kind of configuration I used is shown after the error below). I was able to run up to 3 instances of model.py concurrently on my GPU, with each instance taking a little less than 33% of the VRAM. Mysteriously, when I tried to run a 4th instance I received this error:
2017-09-10 13:27:43.714908: E tensorflow/stream_executor/cuda/cuda_dnn.cc:371] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2017-09-10 13:27:43.714973: E tensorflow/stream_executor/cuda/cuda_dnn.cc:338] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
2017-09-10 13:27:43.714988: F tensorflow/core/kernels/conv_ops.cc:672] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)
Aborted (core dumped)
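For reference, the memory-limiting part of model.py looks roughly like the sketch below (names and the exact fraction are simplified; it assumes the standard TensorFlow 1.x per_process_gpu_memory_fraction option rather than my verbatim code):

import tensorflow as tf

# Cap this process at ~30% of the GPU's memory so that several
# instances of model.py can share the card.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3)
config = tf.ConfigProto(gpu_options=gpu_options)

with tf.Session(config=config) as sess:
    # ... build and train the model here ...
    pass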
I later observed on the TensorFlow GitHub that people seem to think it is unsafe to run more than one TensorFlow process per GPU. Is this true, and is there an explanation for why this is the case? Why was I able to run 3 TensorFlow processes on the same GPU but not 4?