Colab: UnknownError: Failed to get convolution algorithm when using TF 2.5

Question

I am having problems using TensorFlow 2.5 on Google Colab. I assume there is an incompatibility with the CUDA version and/or the cuDNN version. How would I fix it?

I checked the CUDA version used by Colab: it is 11.2, which should be fine with TF 2.5. That would mean the problem is with cuDNN, right?

Code to reproduce:

!pip install tensorflow==2.5.0
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

def my_model():
    inputs = keras.Input(shape=(32, 32, 3))
    x = layers.Conv2D(32, 3)(inputs)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3)(x)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(128, 3)(x)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(10)(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model


model = my_model()
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=3e-4),
    metrics=["accuracy"],
)

model.fit(x_train, y_train, batch_size=64, epochs=10, verbose=2)
model.evaluate(x_test, y_test, batch_size=64, verbose=2)

Error I get

I have tried this answer but I get the same error.

This answer also proposes using tf.config.experimental.set_memory_growth(gpu, True), but that does not work either: I get the same error.

I want to use a GPU. I know that everything works fine without hardware acceleration.
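For reference, the memory-growth suggestion mentioned above is usually applied along these lines. This is a minimal sketch, assuming it runs before any GPU operation has executed; the helper name `enable_memory_growth` is made up for illustration. It did not change the error in this case.

```python
import tensorflow as tf


def enable_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of all at once.

    Hypothetical helper: returns the list of physical GPUs TensorFlow can see
    (an empty list on a CPU-only runtime).
    """
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError as err:
            # Raised when the GPU has already been initialized; the setting
            # has to be applied before any op touches the device.
            print(f"Could not set memory growth for {gpu.name}: {err}")
    return gpus


gpus = enable_memory_growth()
print(f"GPUs visible to TensorFlow: {len(gpus)}")
```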

Set the hardware accelerator to `None`. This disables the GPU in Colab and your code will run fine. — Adarsh Wase, Sep 21 '21 at 10:41
Yes, but I want to run the code using a GPU. This code is only for reproducibility, but I need to use a GPU in the general case. — Jim, Sep 21 '21 at 10:43
Google says on their [website](https://colab.research.google.com/notebooks/tensorflow_version.ipynb#scrollTo=8UvRkm1JGUrk) not to install any TensorFlow version with `!pip install`. So, if you want to use a GPU, use it with TensorFlow 2.6. — Adarsh Wase, Sep 21 '21 at 11:06
@AdarshWase Thank you for the information. I guess I will have to update my code. If you want, post your comment as an answer so I can accept it. — Jim, Sep 21 '21 at 14:58
I have TF 2.7. `!nvcc --version` says CUDA 11.1 and `!nvidia-smi` says CUDA 11.2. Isn't this a disconnect? — Nguai al, Dec 10 '21 at 18:39

score 0 · Accepted Answer · answered Sep 21 '21 at 15:23

In this documentation, Google warns against installing or downgrading TensorFlow with the `!pip install` command.
They write:

Colab builds TensorFlow from source to ensure compatibility with our fleet of accelerators. Versions of TensorFlow fetched from PyPI by pip may suffer from performance problems or may not work at all.

This means that any other TensorFlow version we install might not be compatible with the GPU/TPU configuration Colab provides. So just use TensorFlow 2.6 (the latest version at the time of writing), which is very similar to version 2.5.
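To see what the TensorFlow build in the current runtime expects, something like the following can help. It is only a sketch: the helper name `describe_tf_build` is made up, and it assumes the `cuda_version` and `cudnn_version` keys are present in `tf.sysconfig.get_build_info()`, which holds for GPU builds but not necessarily for CPU-only ones (hence the `"unknown"` fallback).

```python
import tensorflow as tf


def describe_tf_build():
    """Collect the TensorFlow version, the CUDA/cuDNN versions it was built
    against, and the GPUs it can see. Hypothetical helper for illustration."""
    build = tf.sysconfig.get_build_info()
    return {
        "tf_version": tf.__version__,
        # These keys may be missing on CPU-only builds, so fall back gracefully.
        "cuda_version": build.get("cuda_version", "unknown"),
        "cudnn_version": build.get("cudnn_version", "unknown"),
        "visible_gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


for key, value in describe_tf_build().items():
    print(f"{key}: {value}")
```

If `visible_gpus` is empty while the runtime has a GPU attached, or the reported CUDA/cuDNN versions differ from what the runtime provides, that points to the kind of mismatch the quote above warns about.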

OK, good to know. What happens if you get the same error using the default TensorFlow? — Nguai al, Dec 10 '21 at 09:34
