I have Ubuntu 18.04, Python 3.7.3, and TensorFlow 2.0.0.

Here's my CUDA version, as reported by `nvcc --version`:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

My computer is a UX430UQ, and the graphics card is a GeForce 940MX.

Here's the output from nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.01    Driver Version: 418.87.01    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce 940MX       On   | 00000000:01:00.0 Off |                  N/A |
| N/A   45C    P0    N/A /  N/A |    283MiB /  2004MiB |      9%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1014      G   /usr/lib/xorg/Xorg                            24MiB |
|    0      1164      G   /usr/bin/gnome-shell                          47MiB |
|    0      1440      G   /usr/lib/xorg/Xorg                           123MiB |
|    0      1615      G   /usr/bin/gnome-shell                          84MiB |
+-----------------------------------------------------------------------------+

Here's the output when I run `sudo apt-get install cuda`:

Reading package lists...
Building dependency tree...
Reading state information...
cuda is already the newest version (10.1.243-1).
0 upgraded, 0 newly installed, 0 to remove and 138 not upgraded.

Here's the output when I run `tf.test.is_gpu_available()`:

2019-10-08 21:04:37.186069: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2019-10-08 21:04:37.188434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:

name: GeForce 940MX major: 5 minor: 0 memoryClockRate(GHz): 1.2415

pciBusID: 0000:01:00.0

2019-10-08 21:04:37.188863: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64

2019-10-08 21:04:37.189156: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64

2019-10-08 21:04:37.189426: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64

2019-10-08 21:04:37.189687: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64

2019-10-08 21:04:37.189946: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64

2019-10-08 21:04:37.190202: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64

2019-10-08 21:04:37.190236: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7

2019-10-08 21:04:37.190244: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.

Skipping registering GPU devices...

2019-10-08 21:04:37.190261: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:

2019-10-08 21:04:37.190268: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0

2019-10-08 21:04:37.190276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N

talonmies
yew onn
  • Let's see: 1. Your installed TF wants CUDA 10.0: `Could not load dynamic library 'libcudart.so.10.0'`. 2. Your installed CUDA version appears to be 9.1: `Cuda compilation tools, release 9.1, V9.1.85`. 3. Your `LD_LIBRARY_PATH` is pointing to CUDA 8.0: `LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64`. Much of this can probably be sorted out if you do a careful job of installing CUDA. The CUDA 10.0 Linux install guide is [here](https://docs.nvidia.com/cuda/archive/10.0/cuda-installation-guide-linux/index.html). You should start by installing CUDA 10.0 correctly. – Robert Crovella Oct 08 '19 at 13:32
  • Hi, `nvidia-smi` shows that I have already installed CUDA 10.1, but somehow `nvcc --version` reports 9.1.85. I have just added the output of `nvidia-smi` to the question as an edit. – yew onn Oct 08 '19 at 14:09
  • `nvidia-smi` doesn't tell you the installed CUDA version the way you think it does; see [here](https://stackoverflow.com/questions/53422407/different-cuda-versions-shown-by-nvcc-and-nvidia-smi). Anyway, if you have actually installed CUDA 10.1 properly, then you just need to set the `PATH` and `LD_LIBRARY_PATH` variables correctly to use it, which is covered in the CUDA Linux install guide I already linked (step 7). However, your TF is expecting CUDA 10.0, and **you can't use CUDA 10.1 as a substitute/replacement for CUDA 10.0 for TF**. – Robert Crovella Oct 08 '19 at 14:33
  • I just did the two steps: 1) `export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}` and 2) `export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}`. The output of `echo $LD_LIBRARY_PATH` is `/usr/local/cuda-10.1/lib64`, but `tf.test.is_gpu_available()` still reports `LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64`. Also, `sudo apt-get install cuda` says `cuda is already the newest version (10.1.243-1).` (I just edited this into the question.) What should I do about CUDA 10.1 vs. CUDA 10.0 for TF? – yew onn Oct 08 '19 at 15:33
  • After running `export PATH=/usr/local/cuda-10.1/bin:/usr/local/cuda-10.1/NsightCompute-2019.1${PATH:+:${PATH}}`, `nvcc --version` now shows `Cuda compilation tools, release 10.1, V10.1.243`. However, `tf.test.is_gpu_available()` still reports `LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64` even though the output of `echo $LD_LIBRARY_PATH` is `/usr/local/cuda-10.0/lib64`. – yew onn Oct 08 '19 at 15:54
  • If you want to install CUDA 10.0, you can do `sudo apt-get install cuda-10-0`. TF is apparently running in some environment, like conda, from which it is picking up the wrong `LD_LIBRARY_PATH`. – Robert Crovella Oct 08 '19 at 16:45
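
The mismatch discussed in the comments (the shell's `echo $LD_LIBRARY_PATH` showing one value while TensorFlow's warnings show another) can be checked from inside the Python process itself. The following is a minimal sketch that is not part of the original thread. The helper name `probe_cuda_libs` is hypothetical, the library names are copied from the warnings in the question's log, and only the standard library is used.

```python
import ctypes
import os

# Library names copied from the "Could not load dynamic library" warnings
# in the question's log output.
CUDA_10_0_LIBS = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcufft.so.10.0",
    "libcurand.so.10.0",
    "libcusolver.so.10.0",
    "libcusparse.so.10.0",
]


def probe_cuda_libs(names=CUDA_10_0_LIBS):
    """Hypothetical helper: map each library name to whether it can be loaded."""
    results = {}
    for name in names:
        try:
            # ctypes.CDLL asks the dynamic loader to open the library by name,
            # so the loader's usual search path applies.
            ctypes.CDLL(name)
            results[name] = True
        except OSError:
            results[name] = False
    return results


if __name__ == "__main__":
    # The value this process inherited, which may differ from what an
    # interactive shell prints for `echo $LD_LIBRARY_PATH`.
    print("LD_LIBRARY_PATH seen by Python:", os.environ.get("LD_LIBRARY_PATH"))
    for name, ok in probe_cuda_libs().items():
        print(f"{name}: {'loadable' if ok else 'NOT loadable'}")
```

If the value printed here differs from the shell's, the variable is probably being set by whatever launches Python (for example an IDE or an environment activation script) rather than by the `export` commands typed in the terminal. The script should be run the same way TensorFlow is normally run, so that it inherits the same environment.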

1 Answer

You should use CUDA 10 and cuDNN 7.4, as described on this web page.
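
The CUDA version a given TensorFlow build expects can be read off its loader warnings, as the first comment on the question does by hand. Below is a minimal sketch, assuming log lines in the format shown in the question. `missing_cuda_libs` is a hypothetical helper for illustration, not a TensorFlow API.

```python
import re

# Patterns written against the warning format shown in the question's log.
_MISSING = re.compile(r"Could not load dynamic library '(lib\w+)\.so\.([\d.]+)'")
_LD_PATH = re.compile(r"LD_LIBRARY_PATH: (\S+)")


def missing_cuda_libs(log_text):
    """Hypothetical helper: return the (library, version) pairs that failed
    to load and the LD_LIBRARY_PATH values reported alongside them."""
    libs = sorted(set(_MISSING.findall(log_text)))
    paths = set(_LD_PATH.findall(log_text))
    return libs, paths


# One warning line taken from the question's output (timestamp omitted).
sample = (
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:55] "
    "Could not load dynamic library 'libcudart.so.10.0'; dlerror: "
    "libcudart.so.10.0: cannot open shared object file: No such file or "
    "directory; LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64\n"
)

libs, paths = missing_cuda_libs(sample)
print(libs)   # [('libcudart', '10.0')]
print(paths)  # {'/usr/local/cuda-8.0/lib64'}
```

Here the version suffix `10.0` is what the TensorFlow build was looking for, and the reported path shows that the process was searching a CUDA 8.0 directory.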

DachuanZhao