I am looking for an algorithm to measure:
1) the similarity of sentences (around 5000) with each other in a document
2) the similarity of multiple documents (around 5000) with respect to each other
I need this because I am trying to evaluate whether the text documents or sentences that fall under a particular category are in any way similar to each other. Are there any existing methods for doing this?
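
To make the question concrete, here is a minimal sketch of the kind of all-pairs comparison I have in mind. It assumes a plain bag-of-words representation and cosine similarity, which is only one possible measure and not necessarily the right one for my data. The function names (tokenize, cosine_similarity, pairwise_similarity) are my own placeholders, not from any library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens are a good enough unit of comparison.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    # a and b are Counters mapping token -> count.
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def pairwise_similarity(texts):
    # Build one count vector per text, then fill a symmetric n x n matrix.
    vectors = [Counter(tokenize(t)) for t in texts]
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            score = cosine_similarity(vectors[i], vectors[j])
            matrix[i][j] = score
            matrix[j][i] = score
    return matrix


if __name__ == "__main__":
    sentences = ["The cat sat on the mat.", "A cat sat on a mat.", "Stocks fell sharply today."]
    for row in pairwise_similarity(sentences):
        print([round(score, 2) for score in row])
```

With around 5000 items this is roughly 12.5 million comparisons, so the same function could in principle be applied to whole documents as well as sentences, but I do not know whether raw word counts are a meaningful measure at either level.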