5

I've worked with TensorFlow for a while and everything worked properly until I tried to switch to the GPU version.

I uninstalled the previous TensorFlow, pip-installed tensorflow-gpu (v2.0), downloaded and installed Visual Studio Community 2019, downloaded and installed CUDA 10.1, and downloaded and installed cuDNN.

I tested with the CUDA sample "deviceQuery_vs2019" and got a positive result: the test passed and reported an Nvidia GeForce RTX 2070.

When I run a test with a previously working file, I get the error tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: cudaGetErrorString symbol not found.

After some research I found that the supported CUDA version is 10.0, so I downgraded and changed the CUDA path, but nothing changed.

Using this code:


import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

I get:

2019-10-01 16:55:03.317232: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-10-01 16:55:03.420537: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
Num GPUs Available:  1
name: GeForce RTX 2070 major: 7 minor: 5 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
2019-10-01 16:55:03.421029: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-01 16:55:03.421849: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
[Finished in 2.01s]

CUDA seems to recognize the card, and so does TensorFlow, but I cannot get rid of the error: tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: cudaGetErrorString symbol not found.

What am I doing wrong? Should I stick with CUDA 10.0? Am I missing a piece of the installation?

talonmies
m4l4
  • https://www.tensorflow.org/install/gpu says (implicitly) that CUDA 10.0 is required and I have seen elsewhere (github?) instructions to use 10.0 and not 10.1. Unfortunately, I have 10.0 and I have exactly the same error, so I too shall be interested in a usefully explanatory answer. – Julian Moore Oct 03 '19 at 13:04

3 Answers

6

SOLVED. It's mostly an alchemy of versions to avoid conflicts. Here's what I did (order matters, as far as I know):

  1. Uninstall everything (TensorFlow, CUDA, Visual Studio).
  2. pip install tensorflow-gpu
  3. Download and install Visual Studio Community 2017 (2019 won't work).
  4. I also installed the C++ workload from Visual Studio (not sure if it's necessary, but it includes the required Visual C++ 15.x compiler).
  5. Download and install CUDA 10.0 (the one I have is 10.0.130).
  6. Go to the system environment variables (search for it in the Windows search bar) > Advanced > click Environment Variables...
  7. Create a new user variable (not to be confused with a system variable):
  8. Variable name: CUDA_PATH
  9. Variable value: browse to the CUDA directory, down to the version directory (mine is C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0).
  10. The guide says you need cuDNN 7.4.1, but I got an error saying the expected version was 7.6 at minimum. Go to the NVIDIA developer cuDNN archive and download "cuDNN v7.6.0 for CUDA 10.0" (be sure you get the right file). Unzip it and put the cuDNN files into the corresponding CUDA directories (lib, include, bin).
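To double-check the last few steps, a small script can confirm that CUDA_PATH is visible and that the cuDNN files ended up in the right places. This is only a sketch: the file names below (cudnn64_7.dll, cudnn.h, lib\x64\cudnn.lib) are my assumption about the cuDNN 7.x layout on Windows, and missing_cudnn_files is a hypothetical helper name, not part of any library.

```python
import os
from pathlib import Path

# Assumed cuDNN 7.x layout on Windows, relative to the CUDA version directory.
# Adjust these names if your cuDNN archive contains different files.
CUDNN_FILES = (
    Path("bin") / "cudnn64_7.dll",
    Path("include") / "cudnn.h",
    Path("lib") / "x64" / "cudnn.lib",
)


def missing_cudnn_files(cuda_root):
    """Return the assumed cuDNN files that are absent under cuda_root."""
    root = Path(cuda_root)
    return [str(rel) for rel in CUDNN_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    cuda_path = os.environ.get("CUDA_PATH")
    if cuda_path is None:
        print("CUDA_PATH is not set in this process")
    else:
        print("CUDA_PATH =", cuda_path)
        print("missing cuDNN files:", missing_cudnn_files(cuda_path) or "none")
```

If the list of missing files is not empty, the cuDNN copy in step 10 probably went to the wrong directory or to a different CUDA version folder.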

From there everything worked like a charm. I haven't been able to build the CUDA sample (deviceQuery) from Visual Studio, but it's not a vital step. Almost every error was due to incompatible versions of the files; it took me 3-4 days to figure out the right mix. Hope that helps :)

m4l4
  • you saved my life. I spent whole day to get this working without success. What I don't understand is why tensorflow 2.0 even needs a lower cuda and cudnn version to operate. Another pc of mine uses tensorflow 1.14 with cuda 10.1. But 2.0 needs cuda 10.0. No clue why! But thank you!!! – purpleblau Mar 18 '20 at 18:19
  • @m4l4 if we use Anaconda, can it be solved very easily? – BirdANDBird Apr 14 '20 at 11:22
6

tensorflow-gpu v2.0.0 is now available on conda and is very easy to install with: conda install -c anaconda tensorflow-gpu. No additional downloads or CUDA installs are required.
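After the install, a quick first check from Python can show whether the build has CUDA support and which GPUs it lists. This is a minimal sketch; gpu_report is a hypothetical helper name, and it returns None when TensorFlow cannot be imported. Note that in the question the GPU was listed correctly while the cudaGetDevice() error still occurred, so a clean report here is a first check, not a guarantee.

```python
def gpu_report():
    """Summarize the installed TensorFlow build, or return None if it is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Same device query as in the question.
    gpus = tf.config.experimental.list_physical_devices("GPU")
    return {
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [gpu.name for gpu in gpus],
    }


if __name__ == "__main__":
    report = gpu_report()
    if report is None:
        print("TensorFlow is not installed in this environment")
    else:
        for key, value in report.items():
            print(key, "=", value)
```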

LSgeo
  • I should note, the official TensorFlow installation instructions explicitly say not to do this. They recommend using conda to organise your environment, but to then install using pip. See https://www.tensorflow.org/install/pip By default GPU is supported with the standard tensorflow install. You need to specifically install *-cpu to be restricted. I now believe using conda-forge is the best way to install the latest version of tensorflow: https://anaconda.org/conda-forge/tensorflow – LSgeo Oct 13 '22 at 04:21
0

I had similar problems, combined with the fact that I am using Windows 8 and PyCharm. But I figured it out eventually using this post.

The combination that worked:

  • CUDA 10
  • cuDNN 7.6 for Windows 7
  • tensorflow-gpu 2.0
  • Then setting the path environment variable as described above.

It is important to restart after setting the environment variables ;)
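A running program keeps the environment it was started with, so one way to confirm that the new variables are actually visible is to inspect them from Python. The sketch below makes a few assumptions: it reads CUDA_PATH (the variable from the accepted answer), it expects the CUDA directory to end in a version folder such as v10.0, and it uses cudart64_100.dll as the assumed name of the CUDA 10.0 runtime DLL. Both function names are hypothetical helpers.

```python
import os
import re


def cuda_version_from_path(cuda_path):
    """Extract '10.0' from a path whose last directory is 'v10.0', else None."""
    last = re.split(r"[\\/]", cuda_path.rstrip("\\/"))[-1]
    match = re.fullmatch(r"v(\d+\.\d+)", last)
    return match.group(1) if match else None


def find_on_path(filename, path_value=None):
    """Return the PATH directories that contain filename."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    return [
        d for d in path_value.split(os.pathsep)
        if d and os.path.isfile(os.path.join(d, filename))
    ]


if __name__ == "__main__":
    cuda_path = os.environ.get("CUDA_PATH", "")
    print("CUDA_PATH =", cuda_path or "(not set in this process)")
    print("CUDA version from CUDA_PATH:", cuda_version_from_path(cuda_path))
    # Assumed DLL name for the CUDA 10.0 runtime on Windows.
    print("cudart64_100.dll found in:", find_on_path("cudart64_100.dll"))
```

If CUDA_PATH shows up as not set from inside the IDE, the IDE was probably started before the variable was created, which is why restarting helps.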

I did not think that TensorFlow 2.2 would not be able to use CUDA 11...

busssard