I have a list of words like amazing, interesting, love, great, nice, and I want to check whether each word is an adjective or a verb. For example, "love" is a verb and "nice" is an adjective. How can I do this in Python, for example with NLTK?
11
-
Hmm, I don't think word classes have to be mutually exclusive like this. "To love" is the infinitive, but you can love something (verb), or be in love (now it's an adverb), or have a love bracelet or love affair (now it's an adjective). – MathBio Feb 17 '16 at 16:51
-
Without context, the POS of most non-noun words is not conclusive. – alvas Feb 17 '16 at 18:47
-
Without context, the closest you can get is to use the first POS from WordNet, `from nltk.corpus import wordnet as wn; wn.synsets('amazing')[0].pos()`, or the tagger, `import nltk; nltk.pos_tag(['amazing'])`. But as said in the previous comments, the outputs will not be conclusive. – alvas Feb 17 '16 at 18:48
2 Answers
16
One way to guess a word's part of speech without any context is to look it up in WordNet, but it won't be 100% reliable, since a word such as "love" can play different roles in a sentence.
    from nltk.corpus import wordnet as wn

    words = ['amazing', 'interesting', 'love', 'great', 'nice']

    for w in words:
        # POS code of the first WordNet synset: n, v, a, s or r
        tmp = wn.synsets(w)[0].pos()
        print(w, ":", tmp)
Will output:
    amazing : v
    interesting : v
    love : n
    great : n
    nice : n
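A minimal sketch of the same first-synset lookup that returns `None` instead of raising `IndexError` when WordNet has no entry for a token, and that spells out the one-letter codes. The names `POS_NAMES`, `first_pos` and `synsets_for` are made up for this illustration and are not part of NLTK; `synsets_for` is assumed to be any callable that maps a word to a list of synsets, such as `wn.synsets`.

```python
# Readable names for WordNet's one-letter POS codes.
POS_NAMES = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective satellite",
    "r": "adverb",
}


def first_pos(word, synsets_for):
    """Return the readable POS of the first synset, or None if there is none.

    synsets_for is a hypothetical parameter: any callable that takes a word
    and returns a list of synset-like objects with a .pos() method.
    """
    synsets = synsets_for(word)
    if not synsets:
        # The token is unknown to the lookup, so there is nothing to index.
        return None
    code = synsets[0].pos()
    # Fall back to the raw code if it is not one of the five known letters.
    return POS_NAMES.get(code, code)
```

With NLTK and the WordNet corpus installed, this could be called as `first_pos('love', wn.synsets)` after `from nltk.corpus import wordnet as wn`. The result is still only the POS of the first synset, so the caveat about reliability applies unchanged.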

Alex
-
Also, since the question is tagged `parsing`, I am assuming there might be some cases where the token is not a word at all (I just had this issue myself). In that case, make sure you check the output of `wn.synsets(w)` before you try to index into the list. – Jack Ryan Nov 20 '16 at 22:56
-
I think it's ADJECTIVE SATELLITE (https://wordnet.princeton.edu/documentation/wndb5wn) – Nathan B Sep 24 '18 at 08:28
-
Definitely some false positives here, too - 'interesting' is not a verb, 'run' is a verb, yet appears as a noun. – brandonscript Apr 27 '21 at 20:36
-
@brandonscript "He is interesting me to start my own business." Sure a queer way of phrasing it, but "to interest someone" is definitely also a verb. – Daniël van den Berg May 26 '21 at 17:21
3
An update to @Alex's solution:
- Only include synsets whose name starts with the word w itself (instead of taking just the first synset)
- List all POS tags that the word w gets
Code:
    from nltk.corpus import wordnet as wn

    words = ['amazing', 'interesting', 'love', 'great', 'nice']
    pos_all = dict()
    for w in words:
        pos_l = set()
        for tmp in wn.synsets(w):
            # keep only synsets named after the word itself, e.g. 'love.v.01'
            if tmp.name().split('.')[0] == w:
                pos_l.add(tmp.pos())
        pos_all[w] = pos_l
    print(pos_all)
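Output (as printed by Python 2; Python 3 prints the same sets without the `u` prefix, e.g. `{'n', 'v'}`, in arbitrary order):

    {'interesting': set([u'a']),
    'amazing': set([u's']),
    'love': set([u'v', u'n']),
    'great': set([u's', u'n']),
    'nice': set([u'a', u's', u'n'])}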
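To get from these tag sets back to the original "adjective or verb?" question, one option is to count the matching synsets per tag and fold `s` (adjective satellite) into `a`. The sketch below is only an illustration under that assumption: `pos_counts`, `can_be` and the `synsets_for` parameter are hypothetical names, not NLTK functions, and the number of synsets says nothing about how often a word is actually used in a given role.

```python
from collections import Counter


def pos_counts(word, synsets_for):
    """Count POS codes over the synsets that are named after `word`.

    synsets_for is a hypothetical parameter: any callable that takes a word
    and returns synset-like objects with .name() and .pos() methods,
    for example nltk.corpus.wordnet.synsets.
    """
    counts = Counter()
    for syn in synsets_for(word):
        # Same filter as above: keep 'love.v.01', skip e.g. 'beloved.n.01'.
        if syn.name().split(".")[0] == word:
            pos = syn.pos()
            # Treat adjective satellites as plain adjectives.
            counts["a" if pos == "s" else pos] += 1
    return counts


def can_be(word, pos, synsets_for):
    """Return True if at least one matching synset carries the POS code."""
    return pos_counts(word, synsets_for)[pos] > 0
```

With the WordNet corpus available, `can_be('love', 'v', wn.synsets)` and `can_be('nice', 'a', wn.synsets)` would answer the two cases from the question. A word can satisfy both checks at once, which is the point made in the comments about context.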