I am using Anaconda on a Windows 10 laptop with tensorflow-gpu 2.0, an NVIDIA RTX 2070 GPU, Python 3.6.9, CUDA 10.0 (cuda_10.0.130_411.31_win10), and cuDNN 7.6.5 (cudnn-10.0-windows10-x64-v7.6.5.32).

When I run this code:

EPOCHS = 5

for epoch in range(EPOCHS):
  for images, labels in train_ds:
    train_step(images, labels)

  for test_images, test_labels in test_ds:
    test_step(test_images, test_labels)

  template = 'epoch: {}, loss: {}, acc: {}, test loss: {}, test acc: {}'
  print(template.format(epoch+1,
                         train_loss.result(),
                         train_accuracy.result()*100,
                         test_loss.result(),
                         test_accuracy.result()*100))

I get the following error:

---------------------------------------------------------------------------
UnknownError                              Traceback (most recent call last)
<ipython-input-21-fb8a7b9e2d15> in <module>
      3 for epoch in range(EPOCHS):
      4   for images, labels in train_ds:
----> 5     train_step(images, labels)
      6 
      7   for test_images, test_labels in test_ds:

~\Anaconda3\envs\DL\lib\site-packages\tensorflow_core\python\eager\def_function.py in __call__(self, *args, **kwds)
    455 
    456     tracing_count = self._get_tracing_count()
--> 457     result = self._call(*args, **kwds)
    458     if tracing_count == self._get_tracing_count():
    459       self._call_counter.called_without_tracing()

~\Anaconda3\envs\DL\lib\site-packages\tensorflow_core\python\eager\def_function.py in _call(self, *args, **kwds)
    485       # In this case we have created variables on the first call, so we run the
    486       # defunned version which is guaranteed to never create variables.
--> 487       return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
    488     elif self._stateful_fn is not None:
    489       # Release the lock early so that multiple threads can perform the call

~\Anaconda3\envs\DL\lib\site-packages\tensorflow_core\python\eager\function.py in __call__(self, *args, **kwargs)
   1821     """Calls a graph function specialized to the inputs."""
   1822     graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
-> 1823     return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
   1824 
   1825   @property

~\Anaconda3\envs\DL\lib\site-packages\tensorflow_core\python\eager\function.py in _filtered_call(self, args, kwargs)
   1139          if isinstance(t, (ops.Tensor,
   1140                            resource_variable_ops.BaseResourceVariable))),
-> 1141         self.captured_inputs)
   1142 
   1143   def _call_flat(self, args, captured_inputs, cancellation_manager=None):

~\Anaconda3\envs\DL\lib\site-packages\tensorflow_core\python\eager\function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
   1222     if executing_eagerly:
   1223       flat_outputs = forward_function.call(
-> 1224           ctx, args, cancellation_manager=cancellation_manager)
   1225     else:
   1226       gradient_name = self._delayed_rewrite_functions.register()

~\Anaconda3\envs\DL\lib\site-packages\tensorflow_core\python\eager\function.py in call(self, ctx, args, cancellation_manager)
    509               inputs=args,
    510               attrs=("executor_type", executor_type, "config_proto", config),
--> 511               ctx=ctx)
    512         else:
    513           outputs = execute.execute_with_cancellation(

~\Anaconda3\envs\DL\lib\site-packages\tensorflow_core\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     65     else:
     66       message = e.message
---> 67     six.raise_from(core._status_to_exception(e.code, message), None)
     68   except TypeError as e:
     69     keras_symbolic_tensors = [

~\Anaconda3\envs\DL\lib\site-packages\six.py in raise_from(value, from_value)

UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node my_model_1/conv2d_1/Conv2D (defined at C:\Users\hojun\Anaconda3\envs\DL\lib\site-packages\tensorflow_core\python\framework\ops.py:1751) ]] [Op:__inference_train_step_1276]

Function call stack:
train_step

I tried uninstalling and reinstalling the whole NVIDIA GPU stack (driver, CUDA, and cuDNN, in several different versions), but nothing worked. I also tried TensorFlow 2.0 and 2.1, with the same result. Does anyone have an idea how to solve this issue?

Thank you.

talonmies
junmouse
  • Here's a similar problem; hopefully this [solution](https://stackoverflow.com/questions/53698035/failed-to-get-convolution-algorithm-this-is-probably-because-cudnn-failed-to-in) solves your issue. – M.Ali Mar 19 '20 at 16:08
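
A workaround often reported for this "Failed to get convolution algorithm" message on TensorFlow 2.x is to enable GPU memory growth before any model or op is created. Whether it applies to this particular setup is an assumption, not something confirmed in this thread. A minimal sketch follows; the helper name `enable_memory_growth` is hypothetical, while the `tf.config.experimental` calls are the ones available in TensorFlow 2.0 and 2.1.

```python
def enable_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of all at once.

    Returns the number of GPUs configured, or None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # This must run before the GPU is initialized, i.e. before any
        # model, tensor, or dataset op touches the device.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


configured = enable_memory_growth()
print("GPUs configured for memory growth:", configured)
```

This would go at the very top of the notebook, before the cell that defines the model and `train_step`; after changing it, the kernel has to be restarted so that the GPU is not already initialized.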

1 Answer

According to the tested build configurations for Windows on the TensorFlow site, you could try installing cuDNN 7.4 and see whether that works. Hope it helps!
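
Before comparing against the tested combinations, it can help to confirm which versions the active conda environment actually loads. The sketch below is only a diagnostic aid: `describe_environment` is a hypothetical helper name, and it reports the Python and TensorFlow versions, whether the build has CUDA support, and which GPUs TensorFlow sees. It does not report the installed CUDA or cuDNN versions, which still have to be checked separately.

```python
import sys


def describe_environment():
    """Collect the versions that a tested-configuration lookup is keyed on."""
    info = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        info["tensorflow"] = None
        return info

    info["tensorflow"] = tf.__version__
    info["built_with_cuda"] = tf.test.is_built_with_cuda()
    # TF 2.0 only exposes the experimental name; newer releases also
    # provide tf.config.list_physical_devices.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices
    info["gpus"] = [device.name for device in list_devices("GPU")]
    return info


env_info = describe_environment()
for key, value in env_info.items():
    print("{}: {}".format(key, value))
```

If `gpus` comes back empty or `built_with_cuda` is false, the environment is most likely running a CPU-only build or cannot load the CUDA libraries, which is a different problem from a cuDNN version mismatch.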

M.Ali
  • I already tried all of those combinations too. Sadly, nothing has worked yet. :( – junmouse Mar 20 '20 at 01:25
  • Have you tried downgrading to CUDA 9 and cuDNN 7? If that doesn't work, why not try upgrading to Python 3.7 and see whether that resolves anything? – M.Ali Mar 20 '20 at 05:01