Syntactic similarity/distance between 2 sentences/string/text using nltk

Question

I have two texts:

Text1 : John likes apple

Text2 : Mike hates orange

These two texts are syntactically similar, but they have different meanings.

I want to find:

1) The syntactic distance between the two texts

2) The semantic distance between the two texts

Is there a way to do this using nltk? I am new to NLP.

score 4 · Accepted Answer · answered Aug 16 '16 at 14:25


Yes, though you are not limited to nltk. One approach to syntactic distance is part-of-speech (POS) tagging, which maps each word of a sentence to a grammatical tag: https://en.wikipedia.org/wiki/Part-of-speech_tagging

For example, it maps your sentences to:
Text1: Noun Verb Noun
Text2: Noun Verb Noun

You can then measure the distance between the two tag sequences, for example with an edit distance.
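One way to measure that distance is a Levenshtein (edit) distance over the tag sequences rather than over characters. The sketch below is only an illustration, not part of the original answer: the function names are hypothetical, the normalisation by the longer sequence is an assumption, and `pos_tags` assumes nltk is installed together with its tokenizer and tagger data (fetched with `nltk.download`).

```python
from typing import List, Sequence


def tag_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two tag sequences (unit costs)."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, tag_a in enumerate(a, start=1):
        curr = [i]
        for j, tag_b in enumerate(b, start=1):
            cost = 0 if tag_a == tag_b else 1
            curr.append(min(prev[j] + 1,          # delete tag_a
                            curr[j - 1] + 1,      # insert tag_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def syntactic_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Map the distance to [0, 1]; 1.0 means identical tag sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - tag_edit_distance(a, b) / longest


def pos_tags(sentence: str) -> List[str]:
    """Tag a sentence with nltk (assumes nltk and its data are installed)."""
    import nltk
    return [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(sentence))]


# Using the simplified tags from the example above:
text1_tags = ["Noun", "Verb", "Noun"]
text2_tags = ["Noun", "Verb", "Noun"]
print(tag_edit_distance(text1_tags, text2_tags))     # 0
print(syntactic_similarity(text1_tags, text2_tags))  # 1.0
```

With real sentences you would call `pos_tags("John likes apple")` and `pos_tags("Mike hates orange")` and pass the results to the same functions. The actual nltk tags are finer-grained than Noun/Verb, so the scores can differ from this simplified example.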

For semantic distance, you need a semantic lexicon such as WordNet: look up the synonyms of each word in a sentence, then measure how much the synonym sets of the two sentences intersect.
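A rough sketch of the synonym-intersection idea, assuming nltk's WordNet corpus is available (via `nltk.download`). The function names, the choice of Jaccard overlap as the score, and the toy lookup table are all assumptions made for illustration; the answer does not prescribe a particular measure.

```python
from typing import Callable, Iterable, Set


def wordnet_synonyms(word: str) -> Set[str]:
    """All lemma names from every WordNet synset of `word`."""
    from nltk.corpus import wordnet  # assumes nltk and its WordNet data are installed
    return {lemma.name().lower()
            for synset in wordnet.synsets(word)
            for lemma in synset.lemmas()}


def synonym_overlap(words_a: Iterable[str],
                    words_b: Iterable[str],
                    lookup: Callable[[str], Set[str]] = wordnet_synonyms) -> float:
    """Jaccard overlap between the pooled synonym sets of two sentences."""
    # Each word counts as its own synonym, so unknown words still contribute.
    syn_a = set().union(*(lookup(w) | {w.lower()} for w in words_a))
    syn_b = set().union(*(lookup(w) | {w.lower()} for w in words_b))
    union = syn_a | syn_b
    if not union:
        return 0.0
    return len(syn_a & syn_b) / len(union)


# Hypothetical toy lookup so the sketch runs without WordNet data:
toy = {"big": {"large", "big"}, "huge": {"large", "huge"}}
toy_lookup = lambda w: toy.get(w, set())
print(synonym_overlap(["big"], ["huge"], toy_lookup))  # one shared synonym out of three
```

Leaving out the `lookup` argument uses WordNet itself, e.g. `synonym_overlap(["John", "likes", "apple"], ["Mike", "hates", "orange"])`. A higher score means more shared synonyms, so `1 - score` can serve as a distance.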

answered Aug 16 '16 at 14:25

Masoud


This is a good answer. Perhaps you could recommend OP methods of comparison for the 1st case, and a particular word net or resource? I'm sure future readers will be interested too – salezica Aug 17 '16 at 00:16
Thanks @Masoud for the direction. I have a couple of questions: is there a built-in library in nltk that calculates the syntactic distance? If not, how can the distance be measured? Can you provide any reference or resource? – Ganesh Deshvini Aug 17 '16 at 09:04

aerin · Answer 2 · 2016-08-17T00:11:17.050

3

For the semantic part, you might want to try word2vec. A simple baseline is to average the similarities of the words in the two sentences, or you can devise your own scheme that weights the words according to their syntactic role.

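from gensim.models import Word2Vec

# Load a previously trained model (the path is a placeholder; substitute your own)
model = Word2Vec.load("path/to/your/model")

# Cosine similarity between two word vectors; in current gensim this is on model.wv
model.wv.similarity('apple', 'orange')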
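To turn word-level similarities into a sentence-level score, one reading of "average the similarity of words" is to average over every cross-sentence word pair. The helper below is a hypothetical sketch of that reading. It takes the similarity function as a parameter, so a loaded gensim model can supply `model.wv.similarity`, and the toy function here exists only to make the example runnable without a model.

```python
from itertools import product
from typing import Callable, Sequence


def average_pairwise_similarity(words_a: Sequence[str],
                                words_b: Sequence[str],
                                similarity: Callable[[str, str], float]) -> float:
    """Mean similarity over all (word_a, word_b) pairs of two sentences."""
    pairs = list(product(words_a, words_b))
    if not pairs:
        return 0.0
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)


# Toy stand-in for a real model: identical words score 1.0, others 0.5.
def toy_similarity(a: str, b: str) -> float:
    return 1.0 if a == b else 0.5


print(average_pairwise_similarity(["apple", "pear"], ["apple"], toy_similarity))  # 0.75
```

With a trained model the call would look like `average_pairwise_similarity(text1.split(), text2.split(), model.wv.similarity)`. Words missing from the model's vocabulary raise `KeyError` in gensim, so they would need to be filtered out first.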

edited Aug 17 '16 at 00:11

answered Aug 16 '16 at 23:59

aerin


Can you provide any reference for syntactic distance? Is there any built-in library support? – Ganesh Deshvini Aug 22 '16 at 13:47
