I have been facing a weird issue on my laptop for some time now. I have everything related to cuDNN installed as per the TensorFlow site, on both Windows 10 and Ubuntu/elementary OS. I have a 4 GB Nvidia GTX 1650 card.
I have noticed that whenever I run any code where the final layer is a Conv2D layer (basically a fully convolutional network), I get an error saying "Failed to get convolution algorithm". This doesn't happen if the final layer of the network is a Dense layer. To verify this, I tried creating just a single layer and passing a random input to it.
I know my network works, as I have trained and run the same network on Kaggle and Colab with a GPU, and they never gave any issues. I couldn't find any resources related to this.
The following is the output when I pass a random array into a Dense layer (this runs without errors):
import numpy as np
import tensorflow.keras as k
images = np.random.randn(5, 10, 10, 3)
# layer = k.layers.Conv2D(2, 5)(images)
layer = k.layers.Dense(5)(images)
print(layer)
2020-11-17 00:33:47.134992: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-17 00:33:47.179260: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.179651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2020-11-17 00:33:47.179865: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:33:47.181080: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-17 00:33:47.182511: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-17 00:33:47.182833: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-17 00:33:47.184254: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-17 00:33:47.185050: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-17 00:33:47.187985: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:33:47.188197: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.188617: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.188935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-11-17 00:33:47.189223: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-17 00:33:47.195349: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2295785000 Hz
2020-11-17 00:33:47.195666: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f73e0000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-17 00:33:47.195692: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-17 00:33:47.274122: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.274556: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x563cb4ad1bb0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-17 00:33:47.274574: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1650, Compute Capability 7.5
2020-11-17 00:33:47.274769: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.275092: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2020-11-17 00:33:47.275143: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:33:47.275157: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-17 00:33:47.275170: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-17 00:33:47.275182: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-17 00:33:47.275194: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-17 00:33:47.275205: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-17 00:33:47.275218: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:33:47.275274: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.275634: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.275940: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-11-17 00:33:47.275995: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:33:47.276755: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-17 00:33:47.276772: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-11-17 00:33:47.276780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-11-17 00:33:47.276911: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.277264: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.277626: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3550 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1650, pci bus id: 0000:01:00.0, compute capability: 7.5)
WARNING:tensorflow:Layer dense is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because it's dtype defaults to floatx.
If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.
To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.
2020-11-17 00:33:47.654325: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
tf.Tensor(
[[[[-6.94786727e-01 -1.97589189e-01 -4.67383981e-01 -4.58452135e-01
-4.69685793e-01]
[-1.77293643e-02 6.91280663e-01 1.45009533e-01 2.98871025e-02
7.64266670e-01]
[-2.06076771e-01 6.82765841e-01 8.81244838e-02 4.84802090e-02
7.67841816e-01]
...
The following error occurs when I pass the same array to a Conv2D layer:
import numpy as np
import tensorflow.keras as k
images = np.random.randn(5, 10, 10, 3)
layer = k.layers.Conv2D(2, 5)(images)
# layer = k.layers.Dense(5)(images)
print(layer)
2020-11-17 00:37:02.993064: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-17 00:37:03.038038: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.038418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2020-11-17 00:37:03.038637: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:37:03.039979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-17 00:37:03.041443: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-17 00:37:03.041761: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-17 00:37:03.043012: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-17 00:37:03.043827: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-17 00:37:03.046690: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:37:03.046900: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.047315: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.047649: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-11-17 00:37:03.047928: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-17 00:37:03.053663: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2295785000 Hz
2020-11-17 00:37:03.053980: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f5814000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-17 00:37:03.053999: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-17 00:37:03.143379: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.143830: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x559ecda6c460 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-17 00:37:03.143868: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1650, Compute Capability 7.5
2020-11-17 00:37:03.144190: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.144838: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2020-11-17 00:37:03.144923: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:37:03.144954: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-17 00:37:03.144983: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-17 00:37:03.145008: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-17 00:37:03.145033: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-17 00:37:03.145057: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-17 00:37:03.145082: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:37:03.145212: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.145910: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.146501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-11-17 00:37:03.146568: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:37:03.148046: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-17 00:37:03.148082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-11-17 00:37:03.148097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-11-17 00:37:03.148310: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.149016: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.149659: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3550 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1650, pci bus id: 0000:01:00.0, compute capability: 7.5)
WARNING:tensorflow:Layer conv2d is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because it's dtype defaults to floatx.
If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.
To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.
2020-11-17 00:37:03.529234: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:37:03.902756: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-11-17 00:37:03.911356: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 921, in conv2d
_result = pywrap_tfe.TFE_Py_FastPathExecute(
tensorflow.python.eager.core._FallbackException: Expecting int64_t value for attr strides, got numpy.int32
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/kushagr/Documents/Python_tuts/Small_projects/Sudoku_solver/scratch_fcn.py", line 7, in <module>
layer = k.layers.Conv2D(2, 5)(images)
File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 968, in __call__
outputs = self.call(cast_inputs, *args, **kwargs)
File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/keras/layers/convolutional.py", line 207, in call
outputs = self._convolution_op(inputs, self.kernel)
File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py", line 1106, in __call__
return self.conv_op(inp, filter)
File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py", line 638, in __call__
return self.call(inp, filter)
File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py", line 231, in __call__
return self.conv_op(
File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py", line 2006, in conv2d
return gen_nn_ops.conv2d(input, # pylint: disable=redefined-builtin
File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 930, in conv2d
return conv2d_eager_fallback(
File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1021, in conv2d_eager_fallback
_result = _execute.execute(b"Conv2D", 1, inputs=_inputs_flat, attrs=_attrs,
File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]
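Since the final message suggests looking for an earlier warning in the log, here is a minimal sketch for pulling out only the error-level lines from output like the above. It uses only the standard library and assumes the log line layout seen here (timestamp, one-letter level, source location, message); the names `LOG_LINE` and `error_lines` are hypothetical helpers for illustration, not part of TensorFlow.

```python
import re

# Assumed layout of a TensorFlow C++ log line, based on the output above:
#   2020-11-17 00:37:03.902756: E tensorflow/.../cuda_dnn.cc:329] message
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>[\d:.]+): "
    r"(?P<level>[IWEF]) (?P<source>\S+)\] (?P<message>.*)$"
)


def error_lines(log_text):
    """Return (source, message) pairs for lines logged at level E or F."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in ("E", "F"):
            found.append((match.group("source"), match.group("message")))
    return found


# Two lines copied from the Conv2D run above.
sample = """\
2020-11-17 00:37:03.529234: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:37:03.902756: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
"""

for source, message in error_lines(sample):
    print(source, "->", message)
```

Applied to the full Conv2D log, this leaves only the two `Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR` lines, which are the only error-level entries printed before the traceback.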
[Screenshot: nvidia-smi output]
The CUDA version may look like it's not compatible with TensorFlow 2.x, but it works fine with every model where the last layer isn't a Conv2D layer.
Any help would be appreciated. This is my first question on here, so please let me know if any more information would be helpful.
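For reference, a minimal sketch for reporting what TensorFlow itself sees, which may be more telling than the driver-level CUDA version. The function name `describe_environment` is hypothetical; it assumes TensorFlow 2.1 or later (for `tf.config.list_physical_devices`), and the import is wrapped so the script also runs where TensorFlow is missing.

```python
def describe_environment():
    """Collect TensorFlow version and GPU visibility as a plain dict."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter.
        return {"tensorflow": None}
    return {
        "tensorflow": tf.__version__,
        # True if this TensorFlow binary was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Names of the GPUs TensorFlow can see, e.g. "/physical_device:GPU:0".
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

This only reports the environment and does not attempt a fix.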