1

I am (very) new to the field of NLP. I tried to look for an API (in Java) that can tell me whether two pieces of text have the same meaning (or whether one is derived from the other). For example:

"billy said tom was a nice kid"

is the same as

"tom is a nice kid according to billy"

I checked GATE and OpenNLP. It seems that GATE only offers an API for annotations, and OpenNLP doesn't support this either.

Gilles 'SO- stop being evil'
omriliba

2 Answers

3

Omri, no existing piece of software, in Java or another programming language, can tell you this. Text understanding is the holy grail of natural language processing.

I suggest you start with smaller tasks and gradually work toward this vast one. Please see this question and the answers.com page on NLP for some pointers. Textual entailment, an active research area, may be close to what you are asking about.

Yuval F
  • 1
    oops i was going to write some more :) – omriliba Feb 16 '11 at 07:31
  • Anyway, I was looking at this site: http://www.ultralingua.com/ul/en/semantic-search.htm and it seems they claim they can compare two pieces of text for similarity. Is that wrong? – omriliba Feb 16 '11 at 07:33
  • Hmm. You described a very exact example. I believe the state of the art in semantic search has not reached what you asked for. There are many attempts to define similarity between texts, but most are shallow (do not use deep structure) or domain-specific (e.g. only texts about basketball) or both, because the full problem is really hard. If you want to, you can email me at fyuval at gmail dot com for a more focused discussion. – Yuval F Feb 16 '11 at 08:28
0

You can try the Retina API from Cortical.io: it measures the semantic similarity of any two texts using several distance measures (cosine similarity, Jaccard distance, Euclidean distance...). You can even get a visual representation of the semantic overlap.
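
To give a feel for what two of these measures compute, here is a minimal Java sketch of cosine and Jaccard similarity over plain bag-of-words counts. This is not the Retina API and does not reproduce how Cortical.io represents text; the class name `ShallowSimilarity` and its methods are made up for illustration, and the word-count representation is an assumption chosen only because it needs no external library.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class (not part of any library): shallow, word-overlap similarity.
class ShallowSimilarity {

    // Lowercase the text, split on non-alphanumeric characters, and count each word.
    public static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between the two word-count vectors (0.0 if either text has no words).
    public static double cosine(String a, String b) {
        Map<String, Integer> ca = termCounts(a);
        Map<String, Integer> cb = termCounts(b);
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : ca.entrySet()) {
            int v = e.getValue();
            normA += v * v;
            Integer other = cb.get(e.getKey());
            if (other != null) {
                dot += v * other;
            }
        }
        for (int v : cb.values()) {
            normB += v * v;
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Jaccard similarity: shared distinct words divided by all distinct words.
    // The Jaccard distance is 1 minus this value.
    public static double jaccard(String a, String b) {
        Set<String> sa = termCounts(a).keySet();
        Set<String> sb = termCounts(b).keySet();
        Set<String> union = new HashSet<>(sa);
        union.addAll(sb);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(sa);
        shared.retainAll(sb);
        return (double) shared.size() / union.size();
    }

    public static void main(String[] args) {
        String s1 = "billy said tom was a nice kid";
        String s2 = "tom is a nice kid according to billy";
        String s3 = "tom said billy was a nice kid"; // same words as s1, different meaning

        System.out.println("cosine(s1, s2)  = " + cosine(s1, s2));
        System.out.println("jaccard(s1, s2) = " + jaccard(s1, s2));
        System.out.println("cosine(s1, s3)  = " + cosine(s1, s3));
    }
}
```

On the example from the question, the two paraphrases share 5 of 10 distinct words, so the Jaccard similarity is 0.5 and the cosine similarity is about 0.67. The third sentence swaps who said what, yet scores as essentially identical to the first, because word counts ignore structure. That is the kind of shallowness the other answer warns about, so treat scores like these as a rough signal rather than a test of equal meaning.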