
For some code that I want to replicate, I need to install tensorflow==1.15.4 with GPU support. Unfortunately, the pre-built binary is compiled with CUDA 10.0, but I have CUDA 10.2 on my system.

Thus, I wanted to build it from source myself. I followed these official instructions. During configure I always selected the default value, except for Do you wish to build TensorFlow with CUDA support? [y/N]:, which I answered with Y. I used the following build command:

bazel build --config=v1 --config=cuda //tensorflow/tools/pip_package:build_pip_package

I think the --config=cuda is redundant here, but I included it anyway to make sure.

I initially encountered an error during the build, which I was able to resolve with this. After that, the compilation completed successfully.

To my surprise, running the following snippet after the installation indicates that my GPU is still not available to TensorFlow.

import tensorflow as tf


tf.test.is_built_with_cuda()  # True
tf.test.is_gpu_available()  # False

Can someone tell me what I'm doing wrong here?

talonmies
Fugu_Fish
  • Are you certain that the tensorflow you import and call is actually the one you built? – talonmies Dec 03 '20 at 10:39
  • @talonmies It seems so. Running `python -c "import tensorflow as tf; print('\n'.join(tf.__path__))"` resulted in [this output](https://gist.github.com/pmeier/eeff323b0b3fd4069d8cbc423964a74c). This is the library I installed the built wheel in. – Fugu_Fish Dec 03 '20 at 12:57
  • Found the issue: I had `CUDA_VISIBLE_DEVICES=1` as environment variable as indicated in [the sample code I'm trying to replicate](https://github.com/CompVis/adaptive-style-transfer#training). If I set this to `0` or unset it completely the GPU is recognized. Do you know the reason behind it? If so, could you create an answer so I can accept it? – Fugu_Fish Dec 03 '20 at 13:17
  • You can and should create your own answer. It is perfectly OK to answer your own question – talonmies Dec 03 '20 at 13:23
  • @talonmies I'm aware of that. But in that case the answer wouldn't include a reason on why the fix works. If you don't know either, I'm happy to answer it myself. – Fugu_Fish Dec 03 '20 at 13:26
  • You are asking me to answer something which you did involving details which are not even in your question. I couldn't do that even if I wanted to, which I don't – talonmies Dec 03 '20 at 13:28
  • Does this answer your question? [How to set specific gpu in tensorflow?](https://stackoverflow.com/questions/40069883/how-to-set-specific-gpu-in-tensorflow) – talonmies Dec 11 '20 at 05:27
  • @talonmies Kinda. With `CUDA_VISIBLE_DEVICES=1` I hid my GPU with index 0 from `tensorflow` without knowing it. Of course I'm to blame here, since I used code without understanding it, but including a line like this in a "tutorial" seems dangerous at best. Thanks for your insights. – Fugu_Fish Dec 17 '20 at 09:35

1 Answer

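The code I was trying to replicate had CUDA_VISIBLE_DEVICES=1 set as an environment variable. Out of inexperience with TensorFlow, I also set this without understanding what it meant.

CUDA_VISIBLE_DEVICES=1 exposes only the GPU with index 1 to the process. Since I have only a single GPU, i.e. index 0, no device was visible and my GPU was not recognized. Setting the variable to 0 or unsetting it completely makes the GPU available again. Thus, this had nothing to do with the build, which worked as intended.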
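To make the effect concrete, here is a small sketch in plain Python that models which physical GPU indices stay visible for a given CUDA_VISIBLE_DEVICES value. The helper `visible_gpu_indices` is a hypothetical name of mine, not part of TensorFlow or CUDA, and it is a simplified model: it only handles integer indices (not GPU UUIDs) and assumes, as the CUDA documentation describes, that enumeration stops at the first invalid index.

```python
import os


def visible_gpu_indices(env_value, physical_gpu_count):
    """Model which physical GPU indices a CUDA process can see.

    Hypothetical helper for illustration only. Simplification: integer
    indices only; GPU UUIDs are not handled.
    """
    if env_value is None:
        # Variable unset: every physical GPU is visible.
        return list(range(physical_gpu_count))
    visible = []
    for token in env_value.split(","):
        token = token.strip()
        # Enumeration stops at the first invalid or out-of-range index.
        if not token.isdigit() or int(token) >= physical_gpu_count:
            break
        visible.append(int(token))
    return visible


# Single-GPU machine, as in my case:
print(visible_gpu_indices("1", 1))   # [] -> index 1 does not exist, nothing is visible
print(visible_gpu_indices("0", 1))   # [0]
print(visible_gpu_indices(None, 1))  # [0]

# What the current process would actually hand to CUDA:
print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES"))
```

If the last line prints an index that does not exist on the machine, that is a likely reason for `tf.test.is_gpu_available()` returning False even though the build itself is fine. The variable has to be corrected or unset before the process initializes CUDA.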

Fugu_Fish