
What is the best way to build a recurrent language model (e.g. an LSTM) that does not cross sentence boundaries? Or, put more generally: if you present a batch to the model where each row contains multiple sentences, how can you reset the state after each sentence? Is there a special token you can specify to the model?

Thanks!

niefpaarschoenen

1 Answer


If the sentences are independent, it is cleaner to let each row in the batch contain only one sentence. Then you can reset the LSTM's state after each batch, as explained in the answers to this question.
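
For illustration, here is a minimal sketch of that pattern, assuming TensorFlow/Keras with a `stateful=True` LSTM. The layer sizes and the random dummy data are placeholders, and the exact name of the reset method can vary between Keras versions:

```python
import numpy as np
import tensorflow as tf

batch_size, timesteps, vocab_size = 32, 20, 100

# stateful=True makes Keras carry the hidden/cell state across batches,
# so we control exactly when it gets cleared.
lstm = tf.keras.layers.LSTM(128, stateful=True, return_sequences=True)
model = tf.keras.Sequential([
    tf.keras.Input(batch_size=batch_size, shape=(timesteps, vocab_size)),
    lstm,
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")

# Dummy data standing in for one batch of single-sentence rows.
x = np.random.rand(batch_size, timesteps, vocab_size).astype("float32")
y = np.random.rand(batch_size, timesteps, vocab_size).astype("float32")

for step in range(3):
    model.train_on_batch(x, y)
    lstm.reset_states()  # clear the state so nothing leaks into the next batch
```

Note that with the default `stateful=False`, Keras already starts every batch from a zero state, so the manual reset is only needed if you deliberately make the layer stateful.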

Kilian Obermeier