I'm trying to run a TensorFlow model for the first time on an NVIDIA Titan RTX, but training fails with the error shown below.
CUDA version:
$ cat /usr/local/cuda/version.json
{
"cuda" : {
"name" : "CUDA SDK",
"version" : "11.3.20210326"
},
...
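For reference, here is a minimal sketch of reading the same version string from Python, in case it helps to compare it against what the TensorFlow build expects. It assumes the file has the layout shown above (a top-level "cuda" object with a "version" field). The helper names parse_cuda_version and major_minor are my own, not part of any library.

```python
import json
from pathlib import Path


def parse_cuda_version(text):
    """Return the "cuda" -> "version" string from version.json contents."""
    return json.loads(text)["cuda"]["version"]


def major_minor(version):
    """Turn a string like "11.3.20210326" into the tuple (11, 3)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


# Path taken from the `cat` command above; skipped quietly if it is absent.
version_file = Path("/usr/local/cuda/version.json")
if version_file.is_file():
    version = parse_cuda_version(version_file.read_text())
    print(version, major_minor(version))
```

On my machine this should report the 11.3 toolkit, matching the output of cat.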
I'm using Python 3.9.1 and TensorFlow 2.5.0-rc1. The full traceback is:
Traceback (most recent call last):
File "/home/marcus/COVID-19-forecasting/COVID-19/run_experiments.py", line 23, in <module>
exp.run_experiments(dat.horizon, dat.pad_val, dat.padded_scaled_train, dat.multi_out_scaled_val, dat.padded_scaled_test_x,
File "/home/marcus/COVID-19-forecasting/COVID-19/experiment.py", line 110, in run_experiments
lstm_hist = lstm.fit([tr, enc_names], [v[0], v[1], v[2]], self.epochs, verbose=0)
File "/home/marcus/COVID-19-forecasting/COVID-19/model.py", line 55, in fit
return self.model.fit(x=x, y=y, epochs=epochs, callbacks=callbacks, verbose=verbose)
File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/keras/engine/training.py", line 1183, in fit
tmp_logs = self.train_function(iterator)
File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/def_function.py", line 889, in __call__
result = self._call(*args, **kwds)
File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/def_function.py", line 950, in _call
return self._stateless_fn(*args, **kwds)
File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/function.py", line 3023, in __call__
return graph_function._call_flat(
File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/function.py", line 1960, in _call_flat
return self._build_call_outputs(self._inference_function.call(
File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/function.py", line 591, in call
outputs = execute.execute(
File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError: Fail to find the dnn implementation.
[[{{node cond_40/then/_0/cond/CudnnRNNV3}}]]
[[multi_output_rnn/encoder_block/rnn_encoder/PartitionedCall]] [Op:__inference_train_function_6309]
Function call stack:
train_function -> train_function -> train_function
I tried adding these lines to my code, but nothing changed:
import tensorflow as tf
physical_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], enable=True)
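A slightly more defensive variant of the same idea, as a sketch only: it enables memory growth on every visible GPU instead of just the first, and also prints which CUDA and cuDNN versions the installed TensorFlow wheel was built against, so they can be compared with the toolkit on the machine. It assumes TensorFlow 2.3 or newer (for tf.sysconfig.get_build_info). The function name describe_gpu_setup and the keys of the returned dict are my own choices. Memory growth has to be set before TensorFlow initializes the GPUs, so this needs to run before any model or tensor is created.

```python
def describe_gpu_setup():
    """Enable memory growth on all GPUs and summarize the TensorFlow build."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return None

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError as err:
            # Raised when the GPUs were already initialized before this call.
            print(f"Could not set memory growth on {gpu.name}: {err}")

    # The CUDA/cuDNN keys may be missing on CPU-only builds, hence .get().
    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow": tf.__version__,
        "built_for_cuda": build.get("cuda_version"),
        "built_for_cudnn": build.get("cudnn_version"),
        "gpus": [gpu.name for gpu in gpus],
    }


print(describe_gpu_setup())
```

If the "gpus" list comes back empty, or the CUDA version it reports differs from the installed toolkit, that would be the first thing I'd look into.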