
I am performing deep learning on a machine with 4 GPUs. During training, the third GPU is consistently lost (the error "GPU lost" comes up, and the logs indicate it is this specific GPU). I am assuming it is a thermal issue and the GPU is becoming unseated.

Before I fix this hardware issue, I would like to continue using the other 3 GPUs ('/gpu:0', '/gpu:1', '/gpu:3'). Is there a way to specify, in Keras, that these are the GPUs I want to use (or, alternatively, to ignore '/gpu:2')?

I have seen a lot on specifying GPU vs. CPU usage, and on selecting a single GPU on a multi-GPU machine, but not on this specific issue (isolating a subset of the available GPUs).

  • Does this answer your question? [How do I select which GPU to run a job on?](https://stackoverflow.com/questions/39649102/how-do-i-select-which-gpu-to-run-a-job-on) – Daraan Dec 03 '22 at 14:28

1 Answer


You can try to use the CUDA_VISIBLE_DEVICES environment variable:

import os
# Only GPUs 0, 1 and 3 are exposed to CUDA; GPU 2 is hidden entirely
os.environ['CUDA_VISIBLE_DEVICES'] = "0,1,3"

Set this before importing keras/tensorflow, since the variable is read when the CUDA runtime is initialized; changing it afterwards has no effect.
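For illustration, here is a minimal sketch of what the top of the training script might look like, assuming a TensorFlow 2.x backend (tf.config.list_physical_devices is not available in older 1.x versions, where you would use device_lib.list_local_devices() instead):

import os
os.environ['CUDA_VISIBLE_DEVICES'] = "0,1,3"   # must come before the TF/Keras import

import tensorflow as tf

# Should list 3 GPUs; note they are renumbered 0, 1, 2 inside TensorFlow,
# so the physical '/gpu:3' appears as the logical '/gpu:2'.
print(tf.config.list_physical_devices('GPU'))

Any multi-GPU setup built after this point (e.g. tf.distribute.MirroredStrategy) will then only ever see the three healthy cards.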
