
I need to run some ML code on my laptop, and I need the GPU because of dependency constraints in a requirements.txt file. However, it turns out that PyTorch (which has to be an older version, namely 1.7.0) cannot find any CUDA device, even though the GPU is actually present and the CUDA toolkit has been installed.

PyTorch was installed through pip. I also tried installing PyTorch 1.8.0, which supports CUDA up to 11.1 (the oldest version I can install on my WSL), but the behaviour shown below did not change.

I installed the NVIDIA drivers through this link, following the documentation provided by NVIDIA.

GPU: GeForce GTX 1650 Ti

Windows 10 version: 21H2

WSL distro: Ubuntu 20.04

$ uname -r
5.10.60.1-microsoft-standard-WSL2

(3.7.10/envs/python37cuda) ➜  ~ nvidia-smi
Fri Jan 21 23:11:00 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.00       Driver Version: 510.06       CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P8     5W /  N/A |    518MiB /  4096MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
(3.7.10/envs/python37cuda) ➜  ~ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0

(3.7.10/envs/python37cuda) ➜  ~ python
Python 3.7.10 (default, Jan 21 2022, 16:08:33)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
False

Please note that I tried with different CUDA versions, namely 11.6 and 11.1, and nothing changed. Why can't PyTorch see the GPU, and why does it report the CUDA drivers as unavailable? Running nvidia-smi in PowerShell, however, does recognize the drivers.
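
For completeness, a quick diagnostic along these lines (a minimal sketch relying only on standard torch attributes) can be used to gather a bit more information from inside the interpreter:

import torch

print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version the wheel was built against (None for CPU-only wheels)
print(torch.cuda.is_available())  # False in my case
print(torch.cuda.device_count())  # expected to be 0 when no device is visible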

Moreover: lspci | grep NVIDIA returns nothing.

In addition, running docker run --rm --gpus=all nvidia/cuda:11.1-base nvidia-smi gives:

Fri Jan 21 22:24:53 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.00       Driver Version: 510.06       CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   49C    P8     4W /  N/A |    501MiB /  4096MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The docker container can see the GeForce GPU.

Whereas with the command docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark I get:

Error: only 0 Devices available, 1 requested.  Exiting.

so it cannot find any device.

Any hint on how to solve this issue and get PyTorch to use the GPU?

EDIT:

Library and environment paths were both updated to point to the actual CUDA folder (in this case 11.1):

export PATH=/usr/local/cuda-11.1/bin${PATH:+:${PATH}} 

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.1/lib

I forgot to mention that, in PowerShell, nvidia-smi also shows the CUDA driver version.

EDIT: I just found out that running nvidia-smi.exe from within WSL2 actually displays the CUDA version, just as it does in PowerShell. Moreover:

➜  ~ ls -la /dev/dxg
crw-rw-rw- 1 root root 10, 63 Jan 21 22:21 /dev/dxg
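
A more direct way to probe the driver, independently of PyTorch, is to load libcuda with ctypes (a minimal sketch; on WSL2 the driver library is normally the stub provided by Windows under /usr/lib/wsl/lib):

import ctypes

# Load the CUDA driver library exposed to the WSL2 guest.
cuda = ctypes.CDLL("libcuda.so.1")

# cuInit(0) returns 0 (CUDA_SUCCESS) when the kernel driver is reachable.
status = cuda.cuInit(0)
print("cuInit returned:", status)

if status == 0:
    # Ask the driver how many devices it can see.
    count = ctypes.c_int()
    cuda.cuDeviceGetCount(ctypes.byref(count))
    print("device count:", count.value)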
  • You need to install the CUDA version that your torch install expects. You don't have a problem with your WSL/CUDA setup. Alternatively run torch [in a container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch). – Robert Crovella Jan 21 '22 at 22:40
  • pytorch 1.7 [seems to require,](https://pytorch.org/get-started/previous-versions/) at the highest, CUDA 11.0. You cannot use CUDA 11.1, CUDA 11.5, CUDA 11.6, or any other CUDA version, as a replacement for the version pytorch expects. – Robert Crovella Jan 21 '22 at 22:42
  • Thank you for your answer! I edited my OP. However, I tried to install CUDA 11.0 but could not find it in the repo for WSL distros. Therefore, to give it a try, I installed PyTorch 1.8.0, which at least is compatible with CUDA 11.1. Unfortunately, even so, nothing has changed. I thought about running it in a container but, as a dependency in the requirements.txt, there is MATLAB, and it would be difficult for me to set that up there. – Andrea Nicolai Jan 21 '22 at 22:49
  • Is your LD_LIBRARY_PATH in the WSL environment set to properly point to your CUDA 11.1 install? – Robert Crovella Jan 21 '22 at 23:12
  • I should re-edit once more, pardon. Yes, I did both these: `export PATH=/usr/local/cuda-11.1/bin${PATH:+:${PATH}}` and `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.1/lib` – Andrea Nicolai Jan 21 '22 at 23:17
  • Does this answer your question? [Why \`torch.cuda.is\_available()\` returns False even after installing pytorch with cuda?](https://stackoverflow.com/questions/60987997/why-torch-cuda-is-available-returns-false-even-after-installing-pytorch-with) – Martin Zeitler Jan 22 '22 at 09:29
  • Unfortunately, it does not. Although my GPU (which has compute capability 7.5) isn't listed on that Wikipedia page, its drivers can still be downloaded from the NVIDIA site. I'm now trying to downgrade them, to see whether version ~470 might help. Moreover, when testing `torch.zeros(1).cuda`, it returns ` RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx `, even though I actually do have one. – Andrea Nicolai Jan 22 '22 at 09:51
  • I really cannot explain why, but it turns out that downgrading the NVIDIA drivers to `Driver Version: 472.39` makes it work. – Andrea Nicolai Jan 22 '22 at 10:08

1 Answer


The tricky thing with WSL is that you can end up with multiple versions of Python: the distribution's version, the Windows version, an Anaconda install, and many others. So you need to make sure you are using the right one.
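
A quick way to confirm which interpreter and which torch install you are actually using is something like this (just a sketch using the standard library plus torch):

import sys
import torch

print(sys.executable)     # the Python binary actually running
print(torch.__file__)     # where this torch installation lives
print(torch.__version__)  # the version that interpreter imports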

If you are using Ubuntu, they have recommended steps for setting up CUDA, and it is actually quite easy. Check here - https://ubuntu.com/tutorials/enabling-gpu-acceleration-on-ubuntu-on-wsl2-with-the-nvidia-cuda-platform#1-overview

But basically the steps are as follows:

sudo apt-key del 7fa2af80
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/3bf863cc.pub
sudo add-apt-repository 'deb https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/ /'
sudo apt-get update
sudo apt-get -y install cuda

Basically, you do not want to use the default CUDA version provided by your distribution; it needs to match what Windows has installed.

Now you can compile their test application to check whether CUDA is working, like so:

git clone https://github.com/nvidia/cuda-samples
cd cuda-samples/Samples/1_Utilities/deviceQuery
make
./deviceQuery

I should also add that installing the latest stable version from the PyTorch website works as well. Go to their website rather than copying the command below, since it will probably be out of date by the time you read this post.

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
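
After installing, a short sanity check along these lines (a sketch, not an official test script) confirms which CUDA build the wheel carries and that the GPU is visible:

import torch

print(torch.version.cuda)         # CUDA version the wheel was built against
print(torch.cuda.is_available())  # should be True once everything matches
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the detected GPU
    print(torch.zeros(1).cuda())          # allocate a tiny tensor on the GPU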
