Hello, I'm looking for a solution to my problem: I want to find a list of similar words in French and English. For example, "name" could be: first name, last name, nom, prénom, username, ... "Postal address" could be: city, country, street, ville, pays, code postal, ...
Use `nltk` library – bigbounty Feb 26 '19 at 10:02
You can use the `PyDictionary` library (`from PyDictionary import PyDictionary`) to get English synonyms. – Rahil Hastu Feb 26 '19 at 10:03
@Youness see: https://stackoverflow.com/questions/19258652/how-to-get-synonyms-from-nltk-wordnet-python – Abdulla Thanseeh Feb 26 '19 at 10:10
2 Answers
    from PyDictionary import PyDictionary

    dictionary = PyDictionary()
    word = "name"  # example input
    answer = dictionary.synonym(word)  # look up synonyms of `word`

Here `word` is the word whose synonyms you want to find.

The other answer, and comments, describe how to get synonyms, but I think you want more than that?
I can suggest two broad approaches: WordNet and word embeddings.
Using NLTK and WordNet, you want to explore the adjacent graph nodes. See http://www.nltk.org/howto/wordnet.html for an overview of the functions available. I'd suggest that once you've found your start word in WordNet, you follow all its relations, then also go up to its hypernym and do the same there.
Finding the start word is not always easy: http://wordnetweb.princeton.edu/perl/webwn?s=Postal+address&sub=Search+WordNet&o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&h=
Instead it seems I have to use "address": http://wordnetweb.princeton.edu/perl/webwn?s=address&sub=Search+WordNet&o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&h= and then decide which of those is the correct sense here. Then try clicking the hypernym, hyponym, sister term, etc. To be honest, none of those feels quite right.
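As a rough sketch of what "follow the relations, then go up to the hypernym and do the same" could look like in code: the helper below is hypothetical (the names `related_lemmas` and `print_related` are mine, not part of NLTK) and only walks hypernyms, hyponyms and sister terms. It assumes `nltk` is installed and the `wordnet` corpus has been downloaded.

```python
def related_lemmas(synset):
    """Lemma names of a synset, its hyponyms, its hypernyms and its sister terms.

    `synset` is expected to behave like an nltk WordNet Synset, i.e. to
    provide hypernyms(), hyponyms() and lemma_names().
    """
    nodes = [synset] + synset.hyponyms()
    for parent in synset.hypernyms():
        nodes.append(parent)
        # Hyponyms of the hypernym are the "sister terms" (plus the synset itself).
        nodes.extend(parent.hyponyms())
    words = set()
    for node in nodes:
        # WordNet joins multi-word lemmas with underscores.
        words.update(name.replace("_", " ") for name in node.lemma_names())
    return sorted(words)


def print_related(word):
    """Print every noun sense of `word` with its related lemmas."""
    # Imported lazily; needs `pip install nltk` and nltk.download("wordnet").
    from nltk.corpus import wordnet as wn

    for sense in wn.synsets(word, pos=wn.NOUN):
        print(sense.name(), "-", sense.definition())
        print("   ", related_lemmas(sense))
```

Calling `print_related("address")` lists each noun sense with its neighbours, so you can pick the sense you actually mean by eye. That human step is usually still needed.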
Open Multilingual WordNet (http://compling.hss.ntu.edu.sg/omw/) tries to link different languages. So you could take the identifier (synset) you found in the English WordNet and move to the French WordNet with it, or vice versa.
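In NLTK, that cross-language step might look something like the sketch below. The function names are hypothetical. It assumes the `wordnet` and `omw-1.4` corpora have been downloaded, and that French is available under the language code `"fra"`.

```python
def bilingual_lemmas(synset, langs=("eng", "fra")):
    """Map each language code to the synset's lemma names in that language.

    `synset` is expected to behave like an nltk WordNet Synset, whose
    lemma_names() method accepts a language code.
    """
    return {
        lang: [name.replace("_", " ") for name in synset.lemma_names(lang)]
        for lang in langs
    }


def show_translations(word, lang="eng"):
    """Print the English and French lemmas for every sense of `word`."""
    # Imported lazily; needs nltk.download("wordnet") and nltk.download("omw-1.4").
    from nltk.corpus import wordnet as wn

    for sense in wn.synsets(word, lang=lang):
        print(sense.name(), bilingual_lemmas(sense))
```

For example, `show_translations("address")` starts from the English word, while `show_translations("adresse", lang="fra")` starts from the French side. Coverage of the French data is not complete, so some senses may come back with an empty French list.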
The other approach is to use word embeddings. You find the (say, 300-dimensional) vector of your source word, and then hunt for the nearest words in that vector space. This returns words that are used in similar contexts, so they could be similar in meaning or similar syntactically.
spaCy has a good implementation; see https://spacy.io/usage/spacy-101#vectors-similarity and https://spacy.io/usage/vectors-similarity
Regarding English and French, normally you would work in the two languages independently. But if you search for "multilingual word embeddings" you will find some papers and projects where the vector stays the same for the same concept in different languages.
Note: the spaCy API is geared towards telling you how similar two words are, not towards finding similar words. To find similar words you need to compare your vector with every other word vector, which is O(N) in the size of the vocabulary for each query word. So you might want to do this offline, and build your own "synonyms-and-similar" dictionary for each word of interest.
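To make the offline idea concrete, here is a minimal pure-Python sketch of a brute-force nearest-neighbour search using cosine similarity. The function names and the 3-dimensional toy vectors are invented for illustration. In practice the vectors would come from a real model (with spaCy, for instance, something like `nlp.vocab[word].vector`), and you would probably use NumPy for speed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(word, vectors, top_n=3):
    """Compare `word` against every other word: O(N) in the vocabulary size."""
    target = vectors[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:top_n]]


def build_similar_dict(words, vectors, top_n=3):
    """Precompute the nearest neighbours once for each word of interest."""
    return {w: most_similar(w, vectors, top_n) for w in words}


# Toy vectors, made up for this example; real embeddings have ~300 dimensions.
toy_vectors = {
    "name": [0.9, 0.1, 0.0],
    "surname": [0.8, 0.2, 0.1],
    "city": [0.1, 0.9, 0.0],
    "street": [0.0, 0.8, 0.3],
    "banana": [0.0, 0.1, 0.9],
}

similar = build_similar_dict(["name", "city"], toy_vectors, top_n=1)
print(similar)
```

The resulting dictionary can be saved to disk and looked up at run time, so the expensive scan over the whole vocabulary happens only once per word of interest.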

+1 Thanks a lot, that is exactly what I want to do. I will start by finding the start word, and I will also use spaCy to check the similarity. – Youness Drissi Slimani Feb 28 '19 at 14:24
I'm trying to find related and similar words for "restaurant"; however, WordNet gave me the lemmas "eating place", "eatery" and "eating house". Could you describe in a bit more detail the process of following the relations and doing the same for the hypernym? – YHStan Sep 07 '21 at 05:02
@YHStan Concretely, "following the relations" would mean clicking the "S" character next to an entry in WordNet search results, then clicking one of the links, e.g. "direct hypernym", that appears. It *can* be automated using nltk, but you often need human judgement for it to be useful. – Darren Cook Sep 07 '21 at 06:48