I'm looking to write an algorithm that can determine how similar a series of sentences are. Specifically, given a starter sentence, I want to determine whether the sentence that follows is a suitable addition.
For example, take the following:
My dog loves to drink water.
All is good, this is just the first sentence.
The dog hates cats.
All is good, both sentences reference dogs.
It enjoys walks on the beach.
All is good, "it" is neutral enough to be an appropriate continuation.
Pizza is great with pineapple on top.
This would not be a suitable addition, as the sentence does not build on the "narrative" created by the first three sentences.
To outline the project a bit, I've created a library that generates Markov text chains based on the input text. That text is then corrected grammatically to produce viable sentences. I now want to string these sentences together to create coherent paragraphs.
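To make the desired behaviour concrete, here is a minimal sketch of a naive baseline, not a real solution. It assumes a sentence is suitable if it either contains a pronoun (treated as a neutral reference back to the context) or shares at least one content word with the sentences accepted so far. The word lists, the crude plural stripping, and the names `is_suitable` and `build_paragraph` are all made up for illustration.

```python
import re

# Hypothetical word lists, chosen only to cover the example sentences.
PRONOUNS = {"it", "he", "she", "they", "him", "her", "them", "its", "his", "their"}
STOPWORDS = {"a", "an", "the", "my", "to", "of", "on", "in", "at",
             "is", "are", "was", "were", "with", "and", "or", "this", "that"}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic words."""
    return re.findall(r"[a-z]+", sentence.lower())


def content_words(sentence):
    """Return the words left after dropping stopwords and pronouns."""
    words = set()
    for word in tokenize(sentence):
        if word in STOPWORDS or word in PRONOUNS:
            continue
        # Very crude stemming (assumption): drop one trailing "s" from longer words.
        if len(word) > 3 and word.endswith("s"):
            word = word[:-1]
        words.add(word)
    return words


def is_suitable(context, candidate):
    """Decide whether `candidate` may follow the sentences in `context`."""
    if not context:
        return True  # the first sentence is always accepted
    if PRONOUNS & set(tokenize(candidate)):
        return True  # a pronoun is assumed to refer back to the context
    seen = set()
    for sentence in context:
        seen |= content_words(sentence)
    # Suitable only if at least one content word is shared with the context.
    return bool(seen & content_words(candidate))


def build_paragraph(sentences):
    """Keep each sentence that is suitable given the ones already accepted."""
    accepted = []
    for sentence in sentences:
        if is_suitable(accepted, sentence):
            accepted.append(sentence)
    return accepted


example = [
    "My dog loves to drink water.",
    "The dog hates cats.",
    "It enjoys walks on the beach.",
    "Pizza is great with pineapple on top.",
]
print(build_paragraph(example))
```

On the four example sentences this keeps the first three and rejects the pizza sentence. It is obviously fragile (any sentence containing "it" passes, and synonyms never match), which is why I'm looking for a better approach.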