UnknownError: Failed to get convolution algorithm

Question

Complete error:

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]

Command used for package installation:

conda install -c anaconda keras-gpu

It installed:

tensorboard 2.0.0 pyhb38c66f_1
tensorflow 2.0.0 gpu_py37h57d29ca_0
tensorflow-base 2.0.0 gpu_py37h390e234_0
tensorflow-estimator 2.0.0 pyh2649769_0
tensorflow-gpu 2.0.0 h0d30ee6_0 anaconda
cudatoolkit 10.0.130 0
cudnn 7.6.5 cuda10.0_0
keras-applications 1.0.8 py_0
keras-base 2.2.4 py37_0
keras-gpu 2.2.4 0 anaconda
keras-preprocessing 1.1.0 py_1

I have tried installing the CUDA Toolkit from the NVIDIA website, but it did not resolve the issue, so please suggest a fix based on conda commands.

Some blogs suggest installing Visual Studio, but why is that needed if I already have the Spyder IDE?

Code :

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Convolution2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense

classifier = Sequential()

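# In tf.keras the second argument is kernel_size; a bare third positional 3 would be read as strides.
classifier.add(Convolution2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))

classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Convolution2D(32, (3, 3), activation = 'relu'))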
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Flatten())

classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 1, activation = 'sigmoid'))

classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 4,
                                                 class_mode = 'binary')

test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 4,
                                            class_mode = 'binary')

classifier.fit_generator(training_set,
                         steps_per_epoch = 8000,
                         epochs = 25,
                         validation_data = test_set,
                         validation_steps = 2000)

Executing the call below raises the error:

classifier.fit_generator(training_set,
                         steps_per_epoch = 8000,
                         epochs = 25,
                         validation_data = test_set,
                         validation_steps = 2000)

Edit 1: Traceback

Traceback (most recent call last):

  File "D:\Machine Learning\Machine Learning A-Z Template Folder\Part 8 - Deep Learning\Section 40 - Convolutional Neural Networks (CNN)\cnn.py", line 70, in <module>
    validation_steps = 2000)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 1297, in fit_generator
    steps_name='steps_per_epoch')

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training_generator.py", line 265, in model_iteration
    batch_outs = batch_function(*batch_data)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 973, in train_on_batch
    class_weight=class_weight, reset_metrics=reset_metrics)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 264, in train_on_batch
    output_loss_metrics=model._output_loss_metrics)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 311, in train_on_batch
    output_loss_metrics=output_loss_metrics))

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 252, in _process_single_batch
    training=training))

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 127, in _model_loss
    outs = model(inputs, **kwargs)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 891, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\sequential.py", line 256, in call
    return super(Sequential, self).call(inputs, training=training, mask=mask)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\network.py", line 708, in call
    convert_kwargs_to_constants=base_layer_utils.call_context().saving)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\network.py", line 860, in _run_internal_graph
    output_tensors = layer(computed_tensors, **kwargs)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 891, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\layers\convolutional.py", line 197, in call
    outputs = self._convolution_op(inputs, self.kernel)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 1134, in __call__
    return self.conv_op(inp, filter)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 639, in __call__
    return self.call(inp, filter)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 238, in __call__
    name=self.name)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 2010, in conv2d
    name=name)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py", line 1031, in conv2d
    data_format=data_format, dilations=dilations, name=name, ctx=_ctx)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py", line 1130, in conv2d_eager_fallback
    ctx=_ctx, name=name)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)

  File "<string>", line 3, in raise_from

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]

Also check whether `tf.test.is_gpu_available()` returns `True` or `False`. – Vivek Mehta, Jan 23 '20 at 07:50
@Vivek Mehta I have added the traceback, please check 'Edit 1'. – CHETAN RAJPUT, Jan 23 '20 at 13:37
What is the output of `tf.test.is_gpu_available()`? If it is False, there is an issue with your installation. If it is True, the GPU is available but may not have enough free memory; in that case, check whether other processes are eating up your GPU memory. – Vivek Mehta, Jan 23 '20 at 13:59
@Vivek Mehta Yes, the output is True. I have checked that nothing is running in the background, as my GPU usage is 0%. Are my versions compatible? If yes, please suggest something, as I cannot find a solution to this issue anywhere. – CHETAN RAJPUT, Jan 23 '20 at 14:01
If the output is True, there is nothing wrong with your installation. Check the memory available on the GPU (available memory and GPU utilization/usage are different things). It seems there is not enough memory available to fit the model and dataset. – Vivek Mehta, Jan 23 '20 at 14:09
I have 6 GB of VRAM, so I don't think that is the issue: nothing else is running in the background, the dataset I am training on isn't big (hardly 200 MB), and the CNN doesn't have many layers. Anyway, thanks for the help :) – CHETAN RAJPUT, Jan 23 '20 at 14:17
Does this answer your question? [Failed to get convolution algorithm. This is probably because cuDNN failed to initialize,](https://stackoverflow.com/questions/53698035/failed-to-get-convolution-algorithm-this-is-probably-because-cudnn-failed-to-in) – Nowhere Man, Aug 24 '20 at 11:48
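
As the comments above point out, GPU utilization and GPU memory in use are separate numbers: a GPU can sit at 0% utilization while another process still holds most of its memory. A minimal sketch for checking both, assuming the `nvidia-smi` tool that ships with the NVIDIA driver is on the PATH and reports plain integers for each field; the helper names `parse_gpu_stats` and `query_gpu_stats` are made up for this illustration:

```python
import shutil
import subprocess

# Fields requested from nvidia-smi; memory values are reported in MiB.
QUERY = "memory.used,memory.total,utilization.gpu"


def parse_gpu_stats(csv_text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts.

    Assumes every field is an integer; a field such as "[N/A]" would raise.
    """
    stats = []
    for line in csv_text.strip().splitlines():
        used, total, util = (int(field.strip()) for field in line.split(","))
        stats.append({"used_mib": used, "total_mib": total, "util_pct": util})
    return stats


def query_gpu_stats():
    """Return one dict per GPU, or an empty list if nvidia-smi is not found."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_stats(out)


if __name__ == "__main__":
    for i, gpu in enumerate(query_gpu_stats()):
        free = gpu["total_mib"] - gpu["used_mib"]
        print(f"GPU {i}: {free} MiB free of {gpu['total_mib']} MiB, "
              f"{gpu['util_pct']}% utilization")
```

If the free memory is small while utilization is 0%, some other process (for example an earlier Python session that is still open in the IDE) is probably still holding GPU memory.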

score 4 · Accepted Answer · answered Jan 24 '20 at 07:03

The code below solved the issue:

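import tensorflow as tf

# Let TensorFlow grow its GPU memory use on demand instead of reserving it all up front.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Memory growth must be set before the GPUs have been initialized.
        print(e)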
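
A likely reason this helps: by default TensorFlow reserves nearly all GPU memory when it initializes the device, which can leave cuDNN unable to allocate what it needs; with memory growth enabled, TensorFlow allocates GPU memory only as it is needed. The same behaviour can also be requested without touching the model code, through the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable described in TensorFlow's GPU guide. A minimal sketch, assuming the variable is set before TensorFlow is first imported in the process:

```python
import os

# Must be set before TensorFlow initializes the GPU, so do it ahead of the import.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# import tensorflow as tf  # import TensorFlow only after the variable is set
```

In an IDE such as Spyder, the console may need to be restarted first so that no earlier TensorFlow session is still holding the GPU.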

answered Jan 24 '20 at 07:03

CHETAN RAJPUT


Hey I'm just wondering why this solved the issue? thanks – mcstosh Jan 13 '21 at 11:13

score 1 · Answer 2 · answered Jan 23 '20 at 09:38

The error comes from an incompatibility between the following versions:

CUDA version
cuDNN version
TensorFlow version

In the answer linked below I have provided working combinations of TensorFlow, CUDA and cuDNN. Please have a look at this question, which is similar to yours: Tensorflow 2.0 can't use GPU, something wrong in cuDNN? :Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

E.g. CUDA 10.0 + cuDNN 7.6.3 + TensorFlow 1.13/1.14 or TensorFlow 2.0.

E.g. 2: CUDA 9 + cuDNN 7.0.5 + TensorFlow 1.10 works.
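
To compare an environment against the combinations above, a small lookup can help. This is only a sketch: the table holds just the two combinations listed in this answer (it is not an official compatibility matrix), matching on major.minor version prefixes and reading "CUDA 9" as 9.0 are assumptions, and `KNOWN_GOOD`, `major_minor` and `matches_known_combo` are hypothetical names.

```python
# Combinations quoted in this answer: (CUDA, cuDNN, TensorFlow versions).
# "CUDA 9" is read as "9.0" here, which is an assumption.
KNOWN_GOOD = [
    ("10.0", "7.6", ("1.13", "1.14", "2.0")),
    ("9.0", "7.0", ("1.10",)),
]


def major_minor(version):
    """Reduce '10.0.130' to '10.0' so patch releases compare equal."""
    return ".".join(version.split(".")[:2])


def matches_known_combo(cuda, cudnn, tensorflow):
    """True if the three versions match a listed combination on major.minor."""
    for good_cuda, good_cudnn, good_tfs in KNOWN_GOOD:
        if (major_minor(cuda) == good_cuda
                and major_minor(cudnn) == good_cudnn
                and major_minor(tensorflow) in good_tfs):
            return True
    return False


# Versions taken from the question's conda listing.
print(matches_known_combo("10.0.130", "7.6.5", "2.0.0"))  # prints True
```

A `True` result only means the versions line up with one listed combination at the major.minor level; it does not rule out other causes such as insufficient free GPU memory.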

answered Jan 23 '20 at 09:38

Timbus Calin


Thanks for the response, but are my versions mismatched? I ran the conda command and it downloaded the rest of the dependencies itself, so I think they should be compatible. Can you still verify once, as I have already tried tons of permutations? – CHETAN RAJPUT Jan 23 '20 at 13:33
Please use a plain pip install, not conda, and do it in a separate working environment so as not to pollute the global environment. – Timbus Calin Jan 23 '20 at 13:49
I have tried pip as well, but why pip over conda? Okay, let's do as you say: tell me the exact commands that will install TF, CUDA and cuDNN so as to avoid a mismatch. – CHETAN RAJPUT Jan 23 '20 at 13:58
