Peculiar Keras issue with fully convolutional networks

Question

I have been facing a weird issue on my laptop for some time now. I have everything related to cuDNN installed as per the TensorFlow site, on both Windows 10 and Ubuntu/elementary OS. I have a 4 GB Nvidia GTX 1650 card.

I have noticed that whenever I run code in which the final layer is a Conv2D layer (basically a fully convolutional network), I get an error saying "Failed to get convolution algorithm". This doesn't happen if the final layer of the network is a Dense layer. To verify this, I tried building just a single layer and passing a random input to it.

I know my network works, as I have trained and run the same network on Kaggle and Colab with a GPU, and they never gave any issues. I couldn't find any resources related to this problem.

The following is the output when I pass a random array into a Dense layer (this runs without error):

import numpy as np
import tensorflow.keras as k

images = np.random.randn(5, 10, 10, 3)

# layer = k.layers.Conv2D(2, 5)(images)
layer = k.layers.Dense(5)(images)

print(layer)

2020-11-17 00:33:47.134992: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-17 00:33:47.179260: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.179651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2020-11-17 00:33:47.179865: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:33:47.181080: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-17 00:33:47.182511: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-17 00:33:47.182833: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-17 00:33:47.184254: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-17 00:33:47.185050: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-17 00:33:47.187985: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:33:47.188197: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.188617: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.188935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-11-17 00:33:47.189223: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-17 00:33:47.195349: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2295785000 Hz
2020-11-17 00:33:47.195666: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f73e0000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-17 00:33:47.195692: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-11-17 00:33:47.274122: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.274556: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x563cb4ad1bb0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-17 00:33:47.274574: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1650, Compute Capability 7.5
2020-11-17 00:33:47.274769: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.275092: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2020-11-17 00:33:47.275143: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:33:47.275157: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-17 00:33:47.275170: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-17 00:33:47.275182: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-17 00:33:47.275194: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-17 00:33:47.275205: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-17 00:33:47.275218: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:33:47.275274: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.275634: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.275940: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-11-17 00:33:47.275995: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:33:47.276755: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-17 00:33:47.276772: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 
2020-11-17 00:33:47.276780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2020-11-17 00:33:47.276911: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.277264: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:33:47.277626: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3550 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1650, pci bus id: 0000:01:00.0, compute capability: 7.5)
WARNING:tensorflow:Layer dense is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because it's dtype defaults to floatx.

If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.

To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

2020-11-17 00:33:47.654325: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
tf.Tensor(
[[[[-6.94786727e-01 -1.97589189e-01 -4.67383981e-01 -4.58452135e-01
    -4.69685793e-01]
   [-1.77293643e-02  6.91280663e-01  1.45009533e-01  2.98871025e-02
     7.64266670e-01]
   [-2.06076771e-01  6.82765841e-01  8.81244838e-02  4.84802090e-02
     7.67841816e-01]
   ...

The following is the error I get when I pass the same array to a Conv2D layer:

import numpy as np
import tensorflow.keras as k

images = np.random.randn(5, 10, 10, 3)

layer = k.layers.Conv2D(2, 5)(images)
# layer = k.layers.Dense(5)(images)

print(layer)

2020-11-17 00:37:02.993064: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-17 00:37:03.038038: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.038418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2020-11-17 00:37:03.038637: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:37:03.039979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-17 00:37:03.041443: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-17 00:37:03.041761: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-17 00:37:03.043012: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-17 00:37:03.043827: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-17 00:37:03.046690: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:37:03.046900: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.047315: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.047649: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-11-17 00:37:03.047928: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-17 00:37:03.053663: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2295785000 Hz
2020-11-17 00:37:03.053980: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f5814000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-17 00:37:03.053999: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-11-17 00:37:03.143379: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.143830: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x559ecda6c460 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-17 00:37:03.143868: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1650, Compute Capability 7.5
2020-11-17 00:37:03.144190: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.144838: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1650 computeCapability: 7.5
coreClock: 1.56GHz coreCount: 16 deviceMemorySize: 3.82GiB deviceMemoryBandwidth: 119.24GiB/s
2020-11-17 00:37:03.144923: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:37:03.144954: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-17 00:37:03.144983: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-17 00:37:03.145008: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-17 00:37:03.145033: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-17 00:37:03.145057: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-17 00:37:03.145082: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:37:03.145212: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.145910: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.146501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-11-17 00:37:03.146568: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-17 00:37:03.148046: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-17 00:37:03.148082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 
2020-11-17 00:37:03.148097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2020-11-17 00:37:03.148310: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.149016: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-17 00:37:03.149659: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3550 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1650, pci bus id: 0000:01:00.0, compute capability: 7.5)
WARNING:tensorflow:Layer conv2d is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because it's dtype defaults to floatx.

If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.

To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

2020-11-17 00:37:03.529234: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-17 00:37:03.902756: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-11-17 00:37:03.911356: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
  File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 921, in conv2d
    _result = pywrap_tfe.TFE_Py_FastPathExecute(
tensorflow.python.eager.core._FallbackException: Expecting int64_t value for attr strides, got numpy.int32

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/kushagr/Documents/Python_tuts/Small_projects/Sudoku_solver/scratch_fcn.py", line 7, in <module>
    layer = k.layers.Conv2D(2, 5)(images)
  File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 968, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/keras/layers/convolutional.py", line 207, in call
    outputs = self._convolution_op(inputs, self.kernel)
  File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py", line 1106, in __call__
    return self.conv_op(inp, filter)
  File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py", line 638, in __call__
    return self.call(inp, filter)
  File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py", line 231, in __call__
    return self.conv_op(
  File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py", line 2006, in conv2d
    return gen_nn_ops.conv2d(input,  # pylint: disable=redefined-builtin
  File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 930, in conv2d
    return conv2d_eager_fallback(
  File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1021, in conv2d_eager_fallback
    _result = _execute.execute(b"Conv2D", 1, inputs=_inputs_flat, attrs=_attrs,
  File "/home/kushagr/anaconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]
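
Dense layers run on cuBLAS while Conv2D goes through cuDNN, which matches the log: the Dense run only loads `libcublas`, and the Conv2D run fails right after `Could not create cudnn handle`. One way to confirm that the layer itself is fine and only the cuDNN path is failing is to pin the same convolution to the CPU. The following is a minimal sketch of that check, not a fix; the helper name `conv2d_on_cpu` is made up for illustration, and the import guard is only there so the snippet does nothing on a machine without TensorFlow.

```python
import importlib.util


def conv2d_on_cpu():
    """Run the same Conv2D(2, 5) call on the CPU, bypassing cuDNN.

    Returns the output shape as a tuple, or None if TensorFlow
    is not installed in the current environment.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None

    import tensorflow as tf

    # Pin both the input and the layer to the CPU so cuDNN is never involved.
    with tf.device("/CPU:0"):
        images = tf.random.normal((5, 10, 10, 3))  # float32, same shape as in the question
        outputs = tf.keras.layers.Conv2D(2, 5)(images)

    return tuple(outputs.shape)


if __name__ == "__main__":
    print(conv2d_on_cpu())
```

If this succeeds while the GPU version fails, the problem is in cuDNN initialization rather than in the network definition.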

[Screenshot: nvidia-smi output]

The CUDA version may look like it's not compatible with TensorFlow 2.x, but it works fine with every model where the last layer isn't a Conv2D layer.
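
Note that the CUDA version reported by nvidia-smi is the highest version the driver supports, not necessarily the toolkit TensorFlow loads (the log shows `libcudart.so.10.1` and `libcudnn.so.7`). A rough sketch for printing what TensorFlow itself reports is below. The helper name `describe_tf_gpu_setup` is hypothetical, and `tf.sysconfig.get_build_info()` is assumed to exist only in newer TensorFlow releases, so the code checks for it before calling it.

```python
import importlib.util


def describe_tf_gpu_setup():
    """Collect TensorFlow's own view of its CUDA/GPU setup.

    Returns a dict, or None if TensorFlow is not installed.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None

    import tensorflow as tf

    info = {
        "tf_version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "visible_gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }

    # Assumption: get_build_info is missing on older releases, so look it up defensively.
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_build_info is not None:
        build = get_build_info()
        # These keys are absent on CPU-only builds, hence .get().
        info["cuda_version"] = build.get("cuda_version")
        info["cudnn_version"] = build.get("cudnn_version")

    return info


if __name__ == "__main__":
    print(describe_tf_gpu_setup())
```

Comparing these values with the tested build configurations listed on the TensorFlow site is a more direct compatibility check than the nvidia-smi header.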

Any help would be appreciated. This is my first question here. Please let me know if any more information would be helpful.

Welcome to Stack Overflow! Posting code as images is considered bad practice. Could you edit your question and replace the images with text? — Lescurel, Nov 16 '20 at 16:16
Thanks. I've tried to fix the issues with the post as per your feedback. I hope it's OK now. — Kushagr Goyal, Nov 16 '20 at 19:10
Does this answer your question? [Failed to get convolution algorithm. This is probably because cuDNN failed to initialize,](https://stackoverflow.com/questions/53698035/failed-to-get-convolution-algorithm-this-is-probably-because-cudnn-failed-to-in) — Lescurel, Nov 17 '20 at 09:06
No, I went through that post earlier. I don't have memory issues or incompatibility issues. In fact, today I again uninstalled and reinstalled the correct versions of CUDA and cuDNN based on the TensorFlow site. — Kushagr Goyal, Nov 17 '20 at 20:32
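
For readers who arrive with the same `CUDNN_STATUS_INTERNAL_ERROR`: a workaround commonly suggested for this error message is to let TensorFlow allocate GPU memory on demand instead of reserving almost all of it at startup. The log in the question shows TensorFlow creating the device with 3550 MB of the card's 3.82 GiB. Whether that is the cause here is an assumption; nothing in this thread confirms it. The sketch below uses a hypothetical helper name, `enable_gpu_memory_growth`, and must run before any layer or tensor touches the GPU.

```python
import importlib.util


def enable_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory usage on demand.

    Returns the number of GPUs configured, or None if TensorFlow
    is not installed in the current environment.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None

    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Raises RuntimeError if the GPU has already been initialized,
        # so call this at the very top of the script.
        tf.config.experimental.set_memory_growth(gpu, True)

    return len(gpus)


if __name__ == "__main__":
    print(enable_gpu_memory_growth())
```

With this called first, the `Conv2D(2, 5)` snippet from the question can be rerun unchanged to see whether the cuDNN handle is created.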
