I need to run some ML code on my laptop, and I need the GPU because of some dependency constraints pinned in a requirements.txt file. However, it turns out that PyTorch (which I need at an older version, i.e. 1.7.0) cannot find any CUDA device, despite the GPU actually being present and the CUDA toolkit being installed.
PyTorch was installed through pip. I also tried installing PyTorch 1.8.0, which is compatible with CUDA <= 11.1 drivers (the oldest CUDA version I can install on my WSL), but nothing changed from what is shown below.
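For reference, installing the CUDA 11.1 build of PyTorch 1.8.0 through pip looks roughly like this (a sketch; the exact torchvision version paired with it is an assumption):
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html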
I have installed the NVIDIA drivers through this link, following the documentation provided by NVIDIA.
GPU: GeForce GTX 1650 Ti
Windows 10 version: 21H2
WSL distro: Ubuntu 20.04
$ uname -r
5.10.60.1-microsoft-standard-WSL2
(3.7.10/envs/python37cuda) ➜ ~ nvidia-smi
Fri Jan 21 23:11:00 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.00 Driver Version: 510.06 CUDA Version: N/A |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 Off | N/A |
| N/A 48C P8 5W / N/A | 518MiB / 4096MiB | N/A Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
(3.7.10/envs/python37cuda) ➜ ~ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
(3.7.10/envs/python37cuda) ➜ ~ python
Python 3.7.10 (default, Jan 21 2022, 16:08:33)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
False
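For completeness, a quick way to check which CUDA build the installed wheel was compiled against (torch.version.cuda reports None for CPU-only wheels) is something along these lines:
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.device_count())"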
Please note that I tried different CUDA versions, namely 11.6 and 11.1, and nothing changed. Why can't PyTorch see the GPU, and why is CUDA reported as unavailable? Running nvidia-smi in PowerShell, however, does recognize the drivers.
Moreover:
lspci | grep NVIDIA
returns nothing.
In addition, running
docker run --rm --gpus=all nvidia/cuda:11.1-base nvidia-smi
Fri Jan 21 22:24:53 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.00 Driver Version: 510.06 CUDA Version: 11.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 Off | N/A |
| N/A 49C P8 4W / N/A | 501MiB / 4096MiB | N/A Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
The docker container can see the GeForce GPU.
Whereas with the command:
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
Error: only 0 Devices available, 1 requested. Exiting.
it cannot find any device.
Any hint on how to solve this issue and be able to use the GPU from PyTorch?
EDIT:
Library and environment paths were both updated to point to the actual CUDA folder (i.e. in this case 11.1):
export PATH=/usr/local/cuda-11.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.1/lib
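As a sanity check (assuming the two exports above are active in the current shell), the toolkit on the PATH and the library path can be verified with:
which nvcc
echo $LD_LIBRARY_PATH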
Forgot to mention that, when run in PowerShell, nvidia-smi actually also shows the CUDA version.
EDIT:
Just found out that nvidia-smi.exe, when run from within WSL2, actually displays the CUDA version, just as it does in PowerShell.
Moreover:
➜ ~ ls -la /dev/dxg
crw-rw-rw- 1 root root 10, 63 Jan 21 22:21 /dev/dxg