I've installed several PyTorch builds (CUDA 11.7 nightly, CUDA 11.6 nightly, and CUDA 11.3), but every time, `torch.version.cuda` reports 10.2.

I'd like to run PyTorch on CUDA 11.7. My graphics card has CUDA capability sm_86.

[me@legion imagen-test]$ sudo pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113 
...
[me@legion imagen-test]$ python
>>> import torch
>>> print(torch.version.cuda)
10.2

When I actually try to use PyTorch, I get an error saying that the installed PyTorch build doesn't support my graphics card's CUDA capability (sm_86).

>>> torch.Tensor([1,2,3]).cuda()
...
NVIDIA GeForce RTX 3060 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
...
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

I'm completely stumped, and unsure where to go from here. I'd appreciate any help.

talonmies
functorial
  • Does this answer your question? [Why are torch.version.cuda and deviceQuery reporting different versions?](https://stackoverflow.com/questions/69497328/why-are-torch-version-cuda-and-devicequery-reporting-different-versions) – talonmies May 30 '22 at 00:04
  • Do not rely on the `pip` package indexes; they are typically 1-2 library generations behind the CUDA library. First install the NVIDIA driver bundle plus CUDA (a >2.2 GB installer), then download the cuDNN package and install it manually, and only at the very end run `pip3 install pytorch==22.04` – ivan866 May 30 '22 at 00:05
  • `torch.version.cuda` is a hard-coded string set at build time. You can't change it; it is the CUDA version that PyTorch was compiled with – talonmies May 30 '22 at 00:06
  • @functorial You should understand that running the pip command does not install CUDA at all and does not link CUDA to your PyTorch in any way. You need to install everything in the reverse order: first the driver and CUDA (their versions depend strictly on each other), then `cudnn`, and then PyTorch – ivan866 May 30 '22 at 00:07

2 Answers

You've probably installed a PyTorch build for CUDA 10.2 alongside your other installed versions, and it may be taking priority over them when Python imports `torch`. To fix this, uninstall every copy of PyTorch with `pip uninstall torch -y` (repeat until pip reports that nothing is installed), then reinstall the PyTorch build for CUDA 11.7.

Source: https://discuss.pytorch.org/t/cuda-version-is-always-10-2/152876
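Before uninstalling, it can help to see which copy of `torch` Python actually picks up. The following is a minimal sketch using only the standard library; `describe_package` is a hypothetical helper name, not part of PyTorch or pip. It reports the interpreter in use, the file `torch` would be imported from, and every installed copy visible on `sys.path`.

```python
import importlib.metadata
import importlib.util
import sys


def describe_package(name):
    """Report where `name` would be imported from and which copies are installed.

    Hypothetical helper for illustration; it does not import the package itself.
    """
    spec = importlib.util.find_spec(name)
    imported_from = spec.origin if spec is not None else None

    copies = []
    for dist in importlib.metadata.distributions():
        try:
            dist_name = dist.metadata["Name"]
        except Exception:
            continue  # skip distributions with unreadable metadata
        if dist_name and dist_name.lower() == name.lower():
            # locate_file("") resolves to the directory the copy is installed in
            copies.append((dist.version, str(dist.locate_file(""))))

    return {
        "python": sys.executable,
        "imported_from": imported_from,
        "installed_copies": copies,
    }


if __name__ == "__main__":
    info = describe_package("torch")
    print("interpreter:  ", info["python"])
    print("imported from:", info["imported_from"])
    for version, location in info["installed_copies"]:
        print("installed:    ", version, "in", location)
```

If more than one copy is listed, or the import path is not where the latest install went (for example a per-user site-packages directory versus the system one written by `sudo pip3`), that would be consistent with an older build shadowing the new one. Running pip as `python -m pip` ties it to the same interpreter that this script reports.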

Caeden

I'm going to describe the steps for installing TensorFlow with GPU support; I assume they are very similar for PyTorch.
Keep all of these libraries and their version numbers STRICT: NVIDIA driver 510, CUDA 11.6, cuDNN 8.4.0, CUPTI 11.6.
Then install the strict version of pytorch 22.04 (a build with GPU support).
Check that \CUDA\bin, \CUDA\libnvvp, and \CUDA\extras\CUPTI\lib64 are in your PATH (this applies to Windows only).
Check that the CUDA folder is set in your CUDA_PATH and CUDA_PATH_V11_6 environment variables.

Finally, check from within Python that the installation works:

import tensorflow as tf
tf.test.is_gpu_available()

The import should print the correct versions of TensorFlow, CUDA, and CUPTI, and the second line should run successfully and test the GPU.
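For PyTorch, which is what the question is about, a comparable check might look like the sketch below. `arch_covers_device` is a hypothetical helper, and its rule is a simplification: it assumes a compiled `sm_XY` kernel can run on a GPU with the same major compute capability and an equal or higher minor version, and it ignores PTX entries such as `compute_80`. The `torch` calls used are `torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.get_device_capability()`, and `torch.cuda.get_arch_list()`.

```python
import importlib.util
import re


def arch_covers_device(capability, arch_list):
    """Return True if some compiled sm_XY entry should run on the given device.

    Simplified rule (an assumption for illustration): same major version,
    compiled minor version <= device minor version.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)", arch)
        if match is None:
            continue  # skip anything that is not a plain sm_XY entry
        sm = int(match.group(1))
        if sm // 10 == major and sm % 10 <= minor:
            return True
    return False


if __name__ == "__main__":
    if importlib.util.find_spec("torch") is None:
        print("torch is not installed for this interpreter")
    else:
        import torch

        print("torch version:  ", torch.__version__)
        print("built with CUDA:", torch.version.cuda)
        if torch.cuda.is_available():
            capability = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("device capability:", capability)
            print("compiled archs:   ", archs)
            print("covered:          ", arch_covers_device(capability, archs))
        else:
            print("CUDA is not available to this build")
```

With the values from the question, a device capability of `(8, 6)` against the list `sm_37 sm_50 sm_60 sm_70` is not covered, which matches the "no kernel image is available" error.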

ivan866