System info: TensorFlow 1.1.0 (GPU build), Windows, Python 3.5; code runs in ipython consoles.

I am trying to run two different TensorFlow sessions in two separate ipython consoles: one on the GPU (that does some batch work) and one on the CPU, which I use for quick tests while the other runs.

The problem is that when I spawn the second session specifying `with tf.device('/cpu:0')`, the session still tries to allocate GPU memory and crashes my other session.

My code:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = ""  # attempt to hide the GPU from TensorFlow
import time

import tensorflow as tf

with tf.device('/cpu:0'):
  with tf.Session() as sess:
    # Here 6 GB of GPU RAM are allocated anyway.
    time.sleep(5)

How do I force TensorFlow to ignore the GPU?

UPDATE:

As suggested in a comment by @Nicolas, I took a look at this answer and ran:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = ""
import tensorflow as tf

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

which prints:

[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 2215045474989189346
, name: "/gpu:0"
device_type: "GPU"
memory_limit: 6787871540
locality {
  bus_id: 1
}
incarnation: 13663872143510826785
physical_device_desc: "device: 0, name: GeForce GTX 1080, pci bus id: 0000:02:00.0"
]

It seems that even if I explicitly tell the script to ignore any CUDA devices, it still finds and uses them. Could this be a bug in TF 1.1?

GPhilo

2 Answers

It turns out that setting CUDA_VISIBLE_DEVICES to the empty string does not mask the CUDA devices visible to the script.

From the documentation of CUDA_VISIBLE_DEVICES (emphasis added by me):

Only the devices whose index is present in the sequence are visible to CUDA applications and they are enumerated in the order of the sequence. *If one of the indices is invalid, only the devices whose index precedes the invalid index are visible to CUDA applications.* For example, setting CUDA_VISIBLE_DEVICES to 2,1 causes device 0 to be invisible and device 2 to be enumerated before device 1. Setting CUDA_VISIBLE_DEVICES to 0,2,-1,1 causes devices 0 and 2 to be visible and device 1 to be invisible.

It seems the empty string used to be handled as "no valid devices exist", but its meaning changed, as the empty-string case is not mentioned in the documentation.
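
To illustrate the parsing rules from the quote (a sketch; the indices assume a hypothetical machine with three GPUs, and each line is an alternative setting, not a sequence):

import os

# GPU 2 is enumerated before GPU 1; GPU 0 is hidden.
os.environ["CUDA_VISIBLE_DEVICES"] = "2,1"

# GPUs 0 and 2 are visible; -1 is invalid, so every index after it (here GPU 1) is hidden.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,2,-1,1"

# The very first index is invalid, so no devices at all are visible.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"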

Changing the code to `os.environ["CUDA_VISIBLE_DEVICES"] = "-1"` fixes the problem. Running

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
import tensorflow as tf

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

now prints:

[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 14097726166554667970
]

and instantiating a `tf.Session` does not hog GPU memory anymore.
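
For completeness, here is the snippet from the question with the fix applied (a sketch of my setup; the batch job keeps running in its own ipython console):

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # hide all CUDA devices before TensorFlow touches the GPU
import time

import tensorflow as tf

with tf.device('/cpu:0'):
  with tf.Session() as sess:
    # No GPU memory is allocated here anymore; the session in the
    # other process keeps running undisturbed.
    time.sleep(5)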

GPhilo
  • I took the liberty of aggregating (and quoting) various answers (including yours) in some documentation examples, see stackoverflow.com/documentation/tensorflow/10621. I hope you don't mind; feel free to edit it. – pfm Jun 24 '17 at 13:36

Would you mind trying one of these config options?

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand instead of preallocating it all
# or: config.gpu_options.per_process_gpu_memory_fraction = 0.0
with tf.Session(config=config) as sess:
    ...

As per the documentation, this should help you manage the GPU memory for this particular session, so your second session should be able to run on the GPU.
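
For instance, the GPU session that does the batch work could also cap its own allocation so the two processes can share the card (a sketch, not a tested configuration; the 0.4 fraction is an arbitrary value for illustration):

import tensorflow as tf

# Let this process claim at most ~40% of the GPU memory instead of
# letting TensorFlow preallocate the whole card.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4

with tf.Session(config=config) as sess:
    pass  # run the batch work here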

EDIT: according to this answer, you should also try this:

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
pfm
  • The `config.gpu_options.per_process_gpu_memory_fraction = 0.0` option is not working for me: it still tries to allocate memory on the GPU and kills the other session. (Interestingly enough, it's the other session that dies; the second one carries on.) – GPhilo Jun 13 '17 at 08:08
  • The `config.gpu_options.allow_growth = True` option seems to do the trick, although this is IMO quite confusing. `allow_growth` only turns off the preallocation of the GPU memory, but why is the memory preallocated at all when I'm disabling the CUDA device for the script? – GPhilo Jun 13 '17 at 08:18
  • When you said session, did you mean `tf.Session`, as in creating 2 `tf.Session`s in one single Python process: one for the GPU part, the other for the CPU part? – pfm Jun 13 '17 at 09:22
  • Not exactly. It is another `tf.Session`, but in a second ipython console (and thus a separate process). When the second process tries to instantiate the Session instance, the first process crashes ("Kernel died"). – GPhilo Jun 13 '17 at 09:24
  • I updated the question with some extra info I gathered from the question you linked in the (deleted?) comment – GPhilo Jun 13 '17 at 09:40
  • I deleted the comment and put it directly in an edit section of the answer. Also, what if you add `os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"`? – pfm Jun 13 '17 at 09:41
  • Oh, I missed the edit, sorry. As per the code in the edit, that is actually where I started from (and that's where I came up with the second line in my test code). Still, if I set `os.environ["CUDA_VISIBLE_DEVICES"]="0"` I see the GPU (which I guess is the expected behaviour, because the GPU is in fact device 0). If however I try to hide it by setting `os.environ["CUDA_VISIBLE_DEVICES"]=""` I *still* see the GPU. – GPhilo Jun 13 '17 at 09:46
  • Ok, I found the problem; I'll post it as an answer below this one. – GPhilo Jun 13 '17 at 09:50