I need to build a table of relations between the words (synsets) in arbitrary raw text, using WordNet's path_similarity method. The problem is that a single word maps to many synsets:
>>> from nltk.corpus import wordnet as wn
>>> sent = "I went to the bank to deposit money".split()
>>> wn.synsets('bank')
[Synset('bank.n.01'), Synset('depository_financial_institution.n.01'), Synset('bank.n.03'), Synset('bank.n.04'), Synset('bank.n.05'), Synset('bank.n.06'), Synset('bank.n.07'), Synset('savings_bank.n.02'), Synset('bank.n.09'), Synset('bank.n.10'), Synset('bank.v.01'), Synset('bank.v.02'), Synset('bank.v.03'), Synset('bank.v.04'), Synset('bank.v.05'), Synset('deposit.v.02'), Synset('bank.v.07'), Synset('trust.v.01')]
How can I get the correct synset for each word from the raw text?
I can already get the lemmas and POS tags like this:
>>> from nltk import pos_tag
>>> from nltk.stem import WordNetLemmatizer
>>> wnl = WordNetLemmatizer()
>>> wnl.lemmatize('banks')
u'bank'
>>> pos_tag(['banks'])
[('banks', 'NNS')]
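The POS tag can at least narrow the candidate list. A minimal sketch, assuming the tagger emits Penn Treebank tags (as in the output above); the helper name `penn_to_wordnet` is made up for illustration and is not part of NLTK:

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to a WordNet POS letter, or None if there is no match.

    The letters 'n', 'v', 'a', 'r' are the values of wn.NOUN, wn.VERB,
    wn.ADJ and wn.ADV in nltk.corpus.wordnet.
    """
    if tag.startswith("NN"):   # NN, NNS, NNP, NNPS
        return "n"
    if tag.startswith("VB"):   # VB, VBD, VBG, VBN, VBP, VBZ
        return "v"
    if tag.startswith("JJ"):   # JJ, JJR, JJS
        return "a"
    if tag.startswith("RB"):   # RB, RBR, RBS
        return "r"
    return None                # determiners, prepositions, etc. are not in WordNet


noun_pos = penn_to_wordnet("NNS")
other_pos = penn_to_wordnet("DT")
```

The result can be passed as the `pos` argument, e.g. `wn.synsets('bank', pos='n')` or `wnl.lemmatize('banks', pos='n')`. That removes the verb senses of "bank" but still leaves several noun senses to choose from.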
But how do I pick the correct synset (sense number) for each word in its context?
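To make concrete what I mean by picking a sense from context, here is a rough sketch of a simplified Lesk-style overlap: choose the sense whose gloss shares the most words with the sentence. The function name `simplified_lesk` and the two glosses are invented for this example; they are not WordNet's definitions, and the sense keys are placeholders rather than real synset names. NLTK ships its own variant as `nltk.wsd.lesk`, which may be the more practical starting point.

```python
def simplified_lesk(context_words, glosses):
    """Return the sense key whose gloss shares the most words with the context."""
    context = {w.lower() for w in context_words}
    best_sense, best_overlap = None, -1
    for sense, gloss in glosses.items():
        # Count distinct gloss words that also occur in the sentence.
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


# Toy glosses written for this sketch only (assumption, not WordNet data).
toy_glosses = {
    "bank/river": "sloping land beside a body of water",
    "bank/finance": "an institution that accepts deposits of money and lends it",
}

sent = "I went to the bank to deposit money".split()
chosen = simplified_lesk(sent, toy_glosses)
```

With these toy glosses only "money" overlaps, so the financial sense wins. A naive overlap like this is fragile ("deposit" does not match "deposits" without lemmatizing the glosses, and stopwords can dominate), which is part of why I am asking what the recommended approach is.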