0

How to find the semantic similarity between any two given sentences?

Eg: what movies did ron howard direct?

movies directed by ron howard.

I know its a hard problem. But, would like to ask the views of experts. I don't know how to use the Parts of Speech to achieve this. http://nlp.stanford.edu:8080/parser/index.jsp

nizam.sp
  • 4,002
  • 5
  • 39
  • 63
  • 1
    Similar on what level? "Movies not directed my Ron Howard" is lexically very similar to your second sentence but semantically its diametrical opposite. Voting to close as too broad. – tripleee Nov 25 '14 at 06:40
  • I don't see the purpose of your example. Are you working on a QA system? If yes you should state it, and make explicit what is your input and what is expected as output. – Pierre Nov 25 '14 at 10:57
  • possible duplicate of [How to calculate cosine similarity given 2 sentence strings? - Python](http://stackoverflow.com/questions/15173225/how-to-calculate-cosine-similarity-given-2-sentence-strings-python) – alvas Nov 25 '14 at 14:30

1 Answers1

0

Its a broad problem. I would personally go for cosine similarity.

You need to convert your sentences into a vector. For converting the sentence into vector you can consider several rules, like number of occurances, order, synonyms etc. Then taking the cosine distance as mentioned here

You can also explore elasticsearch for finding associated words. You can create your custom analyzers, stemmer, tokenizer, filters(like synonyms) etc. which can be very helpful in finding similar sentences. Elasticsearch also provides more like this query which finds similar documents using the tf-idf scores.

Community
  • 1
  • 1
pratZ
  • 3,078
  • 2
  • 20
  • 29