I use a VM with TensorFlow on Google Cloud.
The VM was created from the official Google image https://console.cloud.google.com/marketplace/product/click-to-deploy-images/deeplearning?_ga=2.148488823.1903313271.1624440425-168625328.1576904373
It worked for a few months, but today I suddenly started getting the error "failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected".
What could have caused this change?
Is it possible that my VM was updated, or is there a problem with Google Cloud?
TF version: 2.4.1
GPU: 1 x NVIDIA Tesla T4
Update
My VM received an unattended upgrade, which probably caused the problem.
Any advice on which drivers I need to reinstall?
Start-Date: 2021-06-24 05:28:25
Commandline: /usr/bin/unattended-upgrade
Upgrade: linux-compiler-gcc-8-x86:amd64 (4.19.181-1, 4.19.194-1)
End-Date: 2021-06-24 05:28:26
Start-Date: 2021-06-24 05:28:27
Commandline: /usr/bin/unattended-upgrade
Upgrade: libhogweed4:amd64 (3.4.1-1, 3.4.1-1+deb10u1), libnettle6:amd64 (3.4.1-1, 3.4.1-1+deb10u1)
End-Date: 2021-06-24 05:28:27
Start-Date: 2021-06-24 05:28:28
Commandline: /usr/bin/unattended-upgrade
Upgrade: shim-helpers-amd64-signed:amd64 (1+15+1533136590.3beb971+7+deb10u1, 1+15.4+5~deb10u1), shim-unsigned:amd64 (15+1533136590.3beb971-7+deb10u1, 15.4-5~deb10u1), shim-signed:amd64 (1.33+15+1533136590.3beb971-7, 1.36~1+deb10u1+15.4-5~deb10u1), shim-signed-common:amd64 (1.33+15+1533136590.3beb971-7, 1.36~1+deb10u1+15.4-5~deb10u1)
End-Date: 2021-06-24 05:28:32
Start-Date: 2021-06-24 05:28:33
Commandline: /usr/bin/unattended-upgrade
Upgrade: base-files:amd64 (10.3+deb10u9, 10.3+deb10u10)
End-Date: 2021-06-24 05:28:33
Start-Date: 2021-06-24 05:28:34
Commandline: /usr/bin/unattended-upgrade
Upgrade: libglib2.0-0:amd64 (2.58.3-2+deb10u2, 2.58.3-2+deb10u3)
End-Date: 2021-06-24 05:28:34
Start-Date: 2021-06-24 05:28:35
Commandline: /usr/bin/unattended-upgrade
Upgrade: libklibc:amd64 (2.0.6-1, 2.0.6-1+deb10u1), klibc-utils:amd64 (2.0.6-1, 2.0.6-1+deb10u1)
End-Date: 2021-06-24 05:28:35
Start-Date: 2021-06-24 05:28:36
Commandline: /usr/bin/unattended-upgrade
Install: linux-image-4.19.0-17-cloud-amd64:amd64 (4.19.194-1, automatic)
Upgrade: linux-image-cloud-amd64:amd64 (4.19+105+deb10u11, 4.19+105+deb10u12)
End-Date: 2021-06-24 05:28:45
Start-Date: 2021-06-24 05:28:45
Commandline: /usr/bin/unattended-upgrade
Upgrade: linux-libc-dev:amd64 (4.19.181-1, 4.19.194-1)
End-Date: 2021-06-24 05:28:46
Start-Date: 2021-06-24 05:28:47
Commandline: /usr/bin/unattended-upgrade
Upgrade: isc-dhcp-client:amd64 (4.4.1-2, 4.4.1-2+deb10u1)
End-Date: 2021-06-24 05:28:47
Start-Date: 2021-06-24 05:28:48
Commandline: /usr/bin/unattended-upgrade
Upgrade: libxml2:amd64 (2.9.4+dfsg1-7+deb10u1, 2.9.4+dfsg1-7+deb10u2)
End-Date: 2021-06-24 05:28:48
Start-Date: 2021-06-24 05:28:49
Commandline: /usr/bin/unattended-upgrade
Upgrade: libgcrypt20:amd64 (1.8.4-5, 1.8.4-5+deb10u1)
End-Date: 2021-06-24 05:28:49
Start-Date: 2021-06-24 05:28:50
Commandline: /usr/bin/unattended-upgrade
Upgrade: linux-kbuild-4.19:amd64 (4.19.181-1, 4.19.194-1)
End-Date: 2021-06-24 05:28:50
Start-Date: 2021-06-24 05:28:51
Commandline: /usr/bin/unattended-upgrade
Upgrade: libgnutls30:amd64 (3.6.7-4+deb10u6, 3.6.7-4+deb10u7)
End-Date: 2021-06-24 05:28:51
Update 2
I tried to update the NVIDIA drivers using
sudo /opt/deeplearning/install-driver.sh
Now I am getting the error
cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error
Running nvidia-smi yields
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   73C    P0    24W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
and nvcc --version prints
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0
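One thing these two outputs allow is a quick check that the toolkit is not newer than what the driver supports. The "CUDA Version" field in the nvidia-smi header is the highest CUDA version the installed driver supports, and the "release" field in the nvcc output is the installed toolkit. The following is a minimal sketch of that comparison; the function names are made up for illustration, and the regular expressions assume the output formats shown above.

```python
import re

def driver_cuda_version(smi_text):
    """Highest CUDA version the driver supports, read from the nvidia-smi header."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_text)
    return (int(match.group(1)), int(match.group(2))) if match else None

def toolkit_cuda_version(nvcc_text):
    """CUDA toolkit release, read from the output of nvcc --version."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_text)
    return (int(match.group(1)), int(match.group(2))) if match else None

def toolkit_supported(smi_text, nvcc_text):
    """True if the toolkit is not newer than the driver's CUDA version, None if unparsable."""
    driver = driver_cuda_version(smi_text)
    toolkit = toolkit_cuda_version(nvcc_text)
    if driver is None or toolkit is None:
        return None
    return toolkit <= driver

if __name__ == "__main__":
    # Header and release lines copied from the outputs above.
    smi = "| NVIDIA-SMI 450.80.02 Driver Version: 450.80.02 CUDA Version: 11.0 |"
    nvcc = "Cuda compilation tools, release 11.0, V11.0.194"
    print(driver_cuda_version(smi), toolkit_cuda_version(nvcc), toolkit_supported(smi, nvcc))
```

A passing check only rules out a driver and toolkit version mismatch. It does not explain CUDA_ERROR_UNKNOWN by itself.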