
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/layers/dense_attention.py says: "This class is suitable for Dense or CNN networks, and not for RNN networks." Anyone know why?

https://www.tensorflow.org/api_docs/python/tf/keras/layers/Attention doesn't mention the above.

Thanks.

user4918159
  • This is the self-attention mechanism from Transformer models (which is computationally very different from attention in CNNs/RNNs). More info: https://arxiv.org/pdf/1706.03762.pdf – thushv89 Aug 24 '20 at 03:38
  • Here's a full explanation and demonstration: https://stackoverflow.com/a/61775631/10375049 – Marco Cerliani Aug 24 '20 at 07:15
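For reference, the TensorFlow docs describe `tf.keras.layers.Attention` as Luong-style dot-product attention: scores are dot products between query and key vectors, softmaxed over the keys, and used to take a weighted sum of the values. Below is a minimal NumPy sketch of that computation (the function name and shapes are illustrative, not from the question or the TF source):

```python
import numpy as np

def dot_product_attention(query, key, value):
    """Illustrative sketch of Luong-style dot-product attention.

    query: (num_queries, dim), key/value: (num_keys, dim).
    """
    # scores[i, j] = dot product of query i with key j
    scores = query @ key.T
    # softmax over the key axis (numerically stabilized)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # each output row is a weighted sum of the value vectors
    return weights @ value

q = np.random.rand(2, 4)
kv = np.random.rand(3, 4)
out = dot_product_attention(q, kv, kv)
print(out.shape)  # (2, 4)
```

Note that nothing in this computation is recurrent: every query attends to every key in one matrix multiply, which is the kind of feed-forward pattern the docstring's "Dense or CNN" remark seems to be pointing at.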

0 Answers