Are there any popular PHP libraries or services that can help detect duplicate content?

I run a site with user-generated content, and I want to detect content that is similar or duplicated. Are there any popular libraries out there that can help with this?

Andrew
Jo E.
    Questions asking us to **recommend or find a tool, library or favorite off-site resource** are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, [describe the problem](http://meta.stackexchange.com/q/139399/) and what has been done so far to solve it. – Madara's Ghost Feb 17 '14 at 17:17

1 Answer

Detecting text similarity, plagiarism, and duplicates is a big topic, and there are many algorithms and solutions for it.

Some projects use "adaptive local alignment of keywords" (you can find information on it by searching Google).

Also, check this question (the three links in its answer are very instructive):

Cosine similarity vs Hamming distance
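
To give a rough idea of what the cosine-similarity approach looks like in PHP, here is a minimal sketch. It treats each text as a vector of word counts and compares the angle between the two vectors. The function names `termFrequencies` and `cosineSimilarity` are made up for this example and do not come from any library, the tokenizer only handles ASCII letters and digits, and the arrow functions need PHP 7.4 or later.

```php
<?php
// Hypothetical helper: count how often each word occurs in a text.
// Assumption: ASCII-only tokenization; adapt it for other alphabets.
function termFrequencies(string $text): array
{
    // Lowercase, then split on every run of non-alphanumeric characters.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return array_count_values($words);
}

// Cosine similarity of the word-count vectors of two texts.
// Returns a value between 0.0 (no words in common) and about 1.0
// (same relative word frequencies).
function cosineSimilarity(string $a, string $b): float
{
    $fa = termFrequencies($a);
    $fb = termFrequencies($b);

    // Dot product over the words that appear in the first text.
    $dot = 0;
    foreach ($fa as $word => $count) {
        $dot += $count * ($fb[$word] ?? 0);
    }

    // Euclidean length of each vector.
    $normA = sqrt(array_sum(array_map(fn($c) => $c * $c, $fa)));
    $normB = sqrt(array_sum(array_map(fn($c) => $c * $c, $fb)));

    // An empty text has no direction, so treat it as not similar.
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }
    return $dot / ($normA * $normB);
}

// Usage: flag a new post when it scores above a threshold.
// The 0.8 threshold is an assumption; tune it on your own content.
$score = cosineSimilarity('The quick brown fox', 'the quick brown dog');
$isLikelyDuplicate = $score >= 0.8;
```

Word counts ignore word order, so this catches copy-and-paste with small edits but not paraphrasing. Comparing every new post against every existing one also grows quadratically with the number of posts, so larger sites usually pre-filter candidates first.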

Hope this helps.

Community
Clément Andraud