
I am having problems using TensorFlow 2.5 on Google Colab. I assume there is an incompatibility with the CUDA and/or cuDNN version. How would I fix it?

I checked the CUDA version used by Colab: it is 11.2, which should be compatible with TF 2.5. That would mean the problem is with cuDNN, right?

Code to reproduce:

!pip install tensorflow==2.5.0
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

def my_model():
    inputs = keras.Input(shape=(32, 32, 3))
    x = layers.Conv2D(32, 3)(inputs)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3)(x)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(128, 3)(x)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(10)(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model


model = my_model()
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=3e-4),
    metrics=["accuracy"],
)

model.fit(x_train, y_train, batch_size=64, epochs=10, verbose=2)
model.evaluate(x_test, y_test, batch_size=64, verbose=2)

Error I get

I have tried this answer but I get the same error.

This answer also suggests calling tf.config.experimental.set_memory_growth(gpu, True), but that does not help either: I get the same error.
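For reference, the memory-growth setting mentioned above is usually applied roughly as in the sketch below. The helper name `enable_memory_growth` is hypothetical, and the call has to happen before TensorFlow initializes the GPUs, otherwise it raises a `RuntimeError`.

```python
def enable_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand (hypothetical helper)."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this runtime; nothing to configure.
        return []

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError as err:
            # Raised when the GPUs have already been initialized.
            print("Could not enable memory growth:", err)
    return gpus


if __name__ == "__main__":
    print("GPUs found:", enable_memory_growth())
```

This only changes how memory is allocated; it would not be expected to help if the installed TensorFlow build does not match the runtime's CUDA/cuDNN libraries.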

I want to use the GPU. I know that everything works fine without hardware acceleration.

E_net4
Jim
  • What's your cuDNN version? – Adarsh Wase Sep 21 '21 at 10:26
  • How do I check the cuDNN on Google Colab? – Jim Sep 21 '21 at 10:27
  • Set the hardware accelerator to `None`. This will disable the GPU in Colab and your code will run fine. – Adarsh Wase Sep 21 '21 at 10:41
  • Yes, but I want to run the code using a GPU. This code is only for reproducibility, but I need to use a GPU in the general case. – Jim Sep 21 '21 at 10:43
  • Google says on their [website](https://colab.research.google.com/notebooks/tensorflow_version.ipynb#scrollTo=8UvRkm1JGUrk) not to install any TensorFlow version with `!pip install`. So, if you want to use the GPU, use it with TensorFlow 2.6. – Adarsh Wase Sep 21 '21 at 11:06
  • @AdarshWase Thank you for the information. I guess I will have to update my code. If you want, post your comment as an answer so I can mark it as accepted. – Jim Sep 21 '21 at 14:58
  • I have TF 2.7; `!nvcc --version` says CUDA 11.1 and `!nvidia-smi` says CUDA 11.2. Isn't this a disconnect? – Nguai al Dec 10 '21 at 18:39

1 Answer


In this documentation, Google warns against installing or downgrading TensorFlow with the !pip command.
They write:

Colab builds TensorFlow from source to ensure compatibility with our fleet of accelerators. Versions of TensorFlow fetched from PyPI by pip may suffer from performance problems or may not work at all.

This means that any other TensorFlow version we install might not be compatible with the GPU/TPU configuration Colab provides. So just use TensorFlow 2.6 (which is the latest version); it is very similar to version 2.5.
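To see which CUDA and cuDNN versions a given TensorFlow build expects, something like the following sketch may help. It relies on `tf.sysconfig.get_build_info()`, which on GPU builds reports `cuda_version` and `cudnn_version` entries. The helper name `tf_build_summary` is made up for illustration, and the keys may be missing on CPU-only builds, hence the "unknown" fallback.

```python
def tf_build_summary(build_info):
    """Format the CUDA/cuDNN entries of a TensorFlow build-info mapping."""
    cuda = build_info.get("cuda_version", "unknown")
    cudnn = build_info.get("cudnn_version", "unknown")
    return f"built for CUDA {cuda}, cuDNN {cudnn}"


def main():
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this runtime")
        return
    print("TensorFlow", tf.__version__)
    # Versions the installed TensorFlow binary was compiled against.
    print(tf_build_summary(tf.sysconfig.get_build_info()))
    # An empty list here means TensorFlow cannot see the GPU.
    print("GPUs visible:", tf.config.list_physical_devices("GPU"))


if __name__ == "__main__":
    main()
```

Running this once with the preinstalled TensorFlow and once after a `!pip install` of another version should show whether the two builds expect different CUDA/cuDNN versions than the runtime provides.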

Adarsh Wase