I want to test a GitHub repository for my work:
https://github.com/tufts-ml/GAN-Ensemble-for-Anomaly-Detection
so I cloned it:
git clone https://github.com/tufts-ml/GAN-Ensemble-for-Anomaly-Detection
Unfortunately, I get an error when I run this command:
sh experiments/run_mnist_en_fanogan.sh
(from the GitHub README)
sh experiments/run_mnist_en_fanogan.sh
/home/svetlana/.local/lib/python3.9/site-packages/torch/cuda/__init__.py:106: UserWarning:
NVIDIA GeForce RTX 3080 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 3080 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
/home/svetlana/.local/lib/python3.9/site-packages/torchvision/datasets/mnist.py:498: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:180.)
return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
Traceback (most recent call last):
File "/home/svetlana/Documents/git/GAN-Ensemble-for-Anomaly-Detection/train.py", line 30, in <module>
main()
File "/home/svetlana/Documents/git/GAN-Ensemble-for-Anomaly-Detection/train.py", line 24, in main
model.train()
File "/home/svetlana/Documents/git/GAN-Ensemble-for-Anomaly-Detection/models/f_anogan.py", line 155, in train
self.gan_training(epoch)
File "/home/svetlana/Documents/git/GAN-Ensemble-for-Anomaly-Detection/models/f_anogan.py", line 93, in gan_training
fake_imgs = self.net_Gds[i_G](z)
File "/home/svetlana/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/svetlana/Documents/git/GAN-Ensemble-for-Anomaly-Detection/models/networks.py", line 175, in forward
output = self.main(input)
File "/home/svetlana/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/svetlana/.local/lib/python3.9/site-packages/torch/nn/modules/container.py", line 139, in forward
input = module(input)
File "/home/svetlana/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/svetlana/.local/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 916, in forward
return F.conv_transpose2d(
RuntimeError: Unable to find a valid cuDNN algorithm to run convolution
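The first UserWarning in this output can be reproduced as a small check. This is only a sketch: it assumes a GPU counts as supported when the PyTorch build lists at least one `sm_XY` architecture with the same major compute capability, and the helper name `is_capability_supported` is hypothetical. The input values are copied from the warning text.

```python
def is_capability_supported(capability, arch_list):
    """Return True if any sm_XY entry shares the GPU's major compute capability.

    Assumption: matching on the major version is enough for compatibility.
    """
    major, _minor = capability
    supported_majors = {
        int(arch.split("_")[1]) // 10
        for arch in arch_list
        if arch.startswith("sm_")
    }
    return major in supported_majors


# Architectures listed in the warning for this PyTorch install.
build_archs = ["sm_37", "sm_50", "sm_60", "sm_70"]
# The RTX 3080 Laptop GPU reports sm_86, i.e. capability (8, 6).
gpu_capability = (8, 6)

print(is_capability_supported(gpu_capability, build_archs))  # False
```

On a live install, the same two inputs should be available from `torch.cuda.get_arch_list()` and `torch.cuda.get_device_capability(0)`.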
I thought my installation was OK, but now I have doubts. Here is my setup:
Python 3.9.6 (default, Jun 30 2021, 10:22:16)
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Wed_Jul_14_19:41:19_PDT_2021
Cuda compilation tools, release 11.4, V11.4.100
Build cuda_11.4.r11.4/compiler.30188945_0
import torch
print(torch.__version__)
1.9.0+cu102
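The `+cu102` suffix in that version string suggests the PyTorch wheel was built against CUDA 10.2, which differs from the 11.4 toolkit reported by `nvcc`. A minimal sketch of reading that tag follows; the helper name `cuda_version_from_torch_tag` is hypothetical, and it assumes the usual `+cuXYZ` convention where the last digit is the minor version.

```python
import re


def cuda_version_from_torch_tag(torch_version):
    """Extract the CUDA version from a string such as '1.9.0+cu102'.

    Returns None for CPU-only or untagged builds.
    """
    match = re.search(r"\+cu(\d+)$", torch_version)
    if match is None:
        return None
    digits = match.group(1)
    # Assumed convention: last digit is the minor version, the rest the major.
    return f"{digits[:-1]}.{digits[-1]}"


print(cuda_version_from_torch_tag("1.9.0+cu102"))  # 10.2
```

On a live install, `torch.version.cuda` should report the same value directly.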
I installed cudnn-11.4 from the NVIDIA website (https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html). I don't know the command to check its version. I tried this one:
cat /opt/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
but it returns nothing.
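One possible reason for the empty result: since cuDNN 8, the `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` macros are defined in `cudnn_version.h` rather than `cudnn.h`. The sketch below parses those macros from header text. The helper name `parse_cudnn_version` is hypothetical, the sample excerpt uses made-up version numbers, and the location `/opt/cuda/include/cudnn_version.h` is an assumption based on the path used in the command.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cuDNN header text, or None if absent."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Illustrative excerpt only; the numbers are invented, not from a real install.
sample_header = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 2
#define CUDNN_PATCHLEVEL 2
"""

print(parse_cudnn_version(sample_header))  # (8, 2, 2)
```

To try it on a real file, pass `pathlib.Path("/opt/cuda/include/cudnn_version.h").read_text()` (path assumed) as `header_text`. Note that `torch.backends.cudnn.version()` reports the cuDNN that PyTorch itself uses, which may differ from the system-wide copy.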
I also tried the solutions from the question "Failed to get convolution algorithm. This is probably because cuDNN failed to initialize", without success (to show VRAM usage, I used nvtop).