I have tried a vanilla encoder-decoder (enc-dec) architecture for English-to-French NMT.
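To give an idea of what I'm working with, here is a minimal sketch of the vanilla enc-dec (the vocab sizes, `latent_dim`, and single-LSTM layout are simplified placeholders, not my exact code):

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Placeholder sizes -- my real vocabularies and hidden size differ
src_vocab, tgt_vocab, latent_dim = 5000, 6000, 256

# Encoder: embed English tokens, keep the final LSTM states
enc_inputs = layers.Input(shape=(None,), name="enc_inputs")
enc_emb = layers.Embedding(src_vocab, latent_dim)(enc_inputs)
enc_seq, state_h, state_c = layers.LSTM(
    latent_dim, return_sequences=True, return_state=True)(enc_emb)

# Decoder: embed French tokens, initialise with the encoder states
dec_inputs = layers.Input(shape=(None,), name="dec_inputs")
dec_emb = layers.Embedding(tgt_vocab, latent_dim)(dec_inputs)
dec_seq = layers.LSTM(latent_dim, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])

# Per-timestep softmax over the French vocabulary
outputs = layers.Dense(tgt_vocab, activation="softmax")(dec_seq)
model = Model([enc_inputs, dec_inputs], outputs)
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")
```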
I want to know how to integrate a Keras attention layer into this model. Suggestions based on the attention layer from the Keras docs, or an attention module from a third-party repo, are both welcome. I just need to get it wired in, see how it behaves, and fine-tune from there.
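To be concrete, this is roughly where I imagine `tf.keras.layers.Attention` slotting in: the decoder sequence as the query and the encoder sequence as the value, with the context vectors concatenated onto the decoder outputs before the softmax. The sizes are placeholders, and I haven't verified this query/value wiring, which is exactly what I'm asking about:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

src_vocab, tgt_vocab, latent_dim = 5000, 6000, 256  # placeholder sizes

enc_in = layers.Input(shape=(None,), name="enc_in")
enc_seq, h, c = layers.LSTM(latent_dim, return_sequences=True, return_state=True)(
    layers.Embedding(src_vocab, latent_dim)(enc_in))

dec_in = layers.Input(shape=(None,), name="dec_in")
dec_seq = layers.LSTM(latent_dim, return_sequences=True)(
    layers.Embedding(tgt_vocab, latent_dim)(dec_in), initial_state=[h, c])

# My guess at the wiring: dot-product attention with
# query = decoder sequence, value = encoder sequence
context = layers.Attention()([dec_seq, enc_seq])

# Concatenate attention context onto decoder outputs, then project
merged = layers.Concatenate()([dec_seq, context])
out = layers.Dense(tgt_vocab, activation="softmax")(merged)

model = Model([enc_in, dec_in], out)
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")
```

Is this the right way to connect it, or should the attention be applied inside the decoding loop instead?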
The full code is available here. I'm not pasting the whole model into the post because it's large and complex.