I store all articles from a number of news sources. An article that originates from, e.g., Cnn.com might be reposted by others, so I end up saving the same article many times.
If I search for 'Tesla', I might get 3 articles that are 90% identical to each other. I can compare and filter duplicates in my app using the Levenshtein distance, but I would rather have ES do the filtering.
Is there a way to say: give me all articles matching WORD, but return only the first one when other hits are more than 90% identical to it?
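
To make the app-side filtering concrete, here is a minimal sketch of the kind of thing I mean. It assumes the hits arrive as plain text strings in ranked order, and it normalizes the Levenshtein distance by the longer string's length to get a similarity between 0 and 1. The function names (levenshtein, similarity, drop_near_duplicates) are only illustrative, not from any library.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute all cost 1)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def drop_near_duplicates(hits, threshold=0.9):
    """Keep a hit only if it is not more than `threshold` similar to a kept hit."""
    kept = []
    for hit in hits:
        if all(similarity(hit, earlier) <= threshold for earlier in kept):
            kept.append(hit)
    return kept
```

This compares every hit against every hit already kept, and each comparison is quadratic in the article length, so it gets slow on long articles and large result sets. That is the main reason I would like the search engine to handle it.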
Cheers, Martin