I made a list of synonyms of the word 'good', and I even told the program not to append a word if it is already in the list. Unfortunately, I still get duplicates. This is my code:

import nltk
from nltk.corpus import wordnet
synonyms = []
for syn in wordnet.synsets("good"):
    for l in syn.lemmas():
        if str(l) not in synonyms:
            synonyms.append(l.name())
print(synonyms)

And the output is the following:

['good', 'good', 'goodness', 'good', 'goodness', 'commodity', 'trade_good', 
'good', 'good', 'full', 'good', 'good', 'estimable', 'good', 'honorable', 
'respectable', 'beneficial', 'good', 'good', 'good', 'just', 'upright', 
'adept', 'expert', 'good', 'practiced', 'proficient', 'skillful', 'skilful',
 'good', 'dear', 'good', 'near', 'dependable', 'good', 'safe', 'secure', 
'good', 'right', 'ripe', 'good', 'well', 'effective', 'good', 'in_effect', 
'in_force', 'good', 'good', 'serious', 'good', 'sound', 'good', 'salutary', 
'good', 'honest', 'good', 'undecomposed', 'unspoiled', 'unspoilt', 'good', 
'well', 'good', 'thoroughly', 'soundly', 'good']

Does somebody know why this is happening?

maybeyourneighour
    You're testing whether `str(l)` is not in your synonyms, but then you're appending `l.name()`. Could it be the case that `str(l) != l.name()`? Why not test whether `l.name()` is in your synonym list? – Thomas Kimber Aug 09 '19 at 10:58
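
To make the mismatch in that comment concrete, here is a minimal sketch that runs without the WordNet data. `FakeLemma` is a hypothetical stand-in, not part of nltk; it assumes that converting a lemma to a string gives something of the form `Lemma('good.n.01.good')` rather than the bare name.

```python
# FakeLemma is a hypothetical stand-in for an nltk lemma, so this runs
# without downloading WordNet. Its string form is an assumption that
# mimics the Lemma('synset.name') style.
class FakeLemma:
    def __init__(self, synset_name, name):
        self._synset_name = synset_name
        self._name = name

    def name(self):
        return self._name

    def __repr__(self):
        return f"Lemma('{self._synset_name}.{self._name}')"


lemmas = [FakeLemma("good.n.01", "good"), FakeLemma("good.n.02", "good")]

synonyms = []
for l in lemmas:
    # str(l) is e.g. "Lemma('good.n.01.good')", which never equals the
    # stored "good", so this check always passes.
    if str(l) not in synonyms:
        synonyms.append(l.name())

print(synonyms)  # ['good', 'good']
```

Because the string being tested is never the string being stored, the membership check cannot filter anything out.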

4 Answers

4

You can use a set object to prevent duplicates.

Example:

import nltk
from nltk.corpus import wordnet
synonyms = set()
for syn in wordnet.synsets("good"):
    for l in syn.lemmas():
        synonyms.add(l.name())

print(synonyms)  # If you need it as a list, use print(list(synonyms))
Rakesh
    Since there is no need for any checks, you might as well go for the comprehension version `synonyms = {l.name() for syn in wordnet.synsets("good") for l in syn.lemmas()}` – Ma0 Aug 09 '19 at 11:00
    Although this is an alternative solution, it does not answer the OP's question of why duplicates end up in the list even though the code checks whether a word is already there. – Sushant Aug 09 '19 at 11:08
    This doesn't preserve the order of the list, and it doesn't answer the OP's question (which is a classic duplicate). – Jean-François Fabre Aug 09 '19 at 11:22
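
If order matters, one possible alternative to a set is `dict.fromkeys`, which keeps the first occurrence of each key in insertion order (guaranteed since Python 3.7). This is only a sketch: `names` is a hypothetical sample taken from the start of the question's output, standing in for the `l.name()` values.

```python
# Hypothetical sample: the first few names from the question's output.
names = ['good', 'good', 'goodness', 'good', 'goodness',
         'commodity', 'trade_good']

# dict keys are unique and remember insertion order,
# so this drops repeats while keeping first-seen order.
unique_in_order = list(dict.fromkeys(names))

print(unique_in_order)  # ['good', 'goodness', 'commodity', 'trade_good']
```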
0

Your test is on l rather than on l.name(), even though l.name() is what you store in your list. Instead, use:

        if l.name() not in synonyms:
            synonyms.append(l.name())
Charles
0

I think it's because the code is using str(l) to look for duplicates but then storing l.name().

The following should work:

import nltk
from nltk.corpus import wordnet
synonyms = []
for syn in wordnet.synsets("good"):
    for l in syn.lemmas():
        if l.name() not in synonyms:
            synonyms.append(l.name())
print(synonyms)
Steve
-1

The variable l might have some unique id attached to it, so str(l) would not match the plain name.

You should try:

        if str(l.name()) not in synonyms:
            synonyms.append(l.name())