It seems like the attention() method used to compute the attention mask in seq2seq_model.py (from TensorFlow's sequence-to-sequence example) is not called during decoding.
Does anyone know how to resolve this? A similar question was raised here: Visualizing attention activation in Tensorflow, but it's not clear to me how to actually get the attention-weight matrix out during decoding.
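For context, here is a rough NumPy sketch of the matrix I'm trying to extract. My (possibly wrong) understanding is that attention() computes a softmax over scores between the current decoder state and all encoder states at each step, and stacking those per-step weight vectors over the decoding steps gives the alignment matrix I want to visualize. The function and variable names below are just illustrative, not from the TensorFlow code:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

def attention_weights(decoder_state, encoder_states):
    """Dot-product scores of one decoder state against all encoder
    states, normalized into a probability distribution.
    (Illustrative only; the real attention() uses its own scoring.)"""
    scores = encoder_states @ decoder_state  # shape: (T_enc,)
    return softmax(scores)                   # shape: (T_enc,)

# Toy sizes: 5 encoder steps, 3 decoder steps, hidden size 4.
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 4))

# During decoding, collect the weights at every output step; stacking
# them gives the (T_dec, T_enc) matrix I'd like to plot.
alignment = []
for step in range(3):
    decoder_state = rng.normal(size=(4,))
    alignment.append(attention_weights(decoder_state, encoder_states))
alignment = np.stack(alignment)
print(alignment.shape)  # (3, 5): one row of attention weights per output token
```

My guess is that the decoder would need to be modified to return these per-step weights as an extra output so they can be fetched at decode time, but I'm not sure how to do that cleanly with the example code.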
Thanks!