4

I am developing a Python program to find the etymology of the words in a text. As far as I can tell, there are basically two options: parsing an online dictionary that provides etymology, or using an API. I found this reply here, but I don't understand how to connect the Oxford API to my Python program.

Can anyone explain to me how to look up a word in an English dictionary? Thank you in advance.

Link to the question here

WordNet does not have all English words, but what about the Oxford English Dictionary (http://developer.oxforddictionaries.com/)? Depending on the scope of your project, it could be a killer API. Have you tried looking at Grady Ward's Moby (http://icon.shef.ac.uk/Moby/)? You could add it as a lexicon in NLTK (see notes on "Loading your own corpus" in Section 2.1).

# Example 1: read every plain-text file in a local directory as a corpus
from nltk.corpus import PlaintextCorpusReader
corpus_root = '/usr/share/dict'
wordlists = PlaintextCorpusReader(corpus_root, '.*')

# Example 2: read a local copy of the Penn Treebank (bracketed parse files)
from nltk.corpus import BracketParseCorpusReader
corpus_root = r"C:\corpora\penntreebank\parsed\mrg\wsj"
file_pattern = r".*/wsj_.*\.mrg"
ptb = BracketParseCorpusReader(corpus_root, file_pattern)
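
For the narrower task of checking whether a word appears in a plain word list, a minimal sketch that does not depend on NLTK might look like the following. The names `load_wordlist` and `is_known_word` are made up for illustration, and the input is assumed to be a one-word-per-line list (such as the files some Unix systems keep under `/usr/share/dict`). An in-memory sample stands in for the file here.

```python
from typing import Iterable, Set


def load_wordlist(lines: Iterable[str]) -> Set[str]:
    """Build a lowercase set of words from an iterable of lines (one word per line)."""
    return {line.strip().lower() for line in lines if line.strip()}


def is_known_word(word: str, words: Set[str]) -> bool:
    """Case-insensitive membership test against the loaded word list."""
    return word.strip().lower() in words


# In-memory sample standing in for an open word-list file (assumption:
# a real file handle could be passed to load_wordlist in the same way).
sample_lines = ["Potato\n", "drink\n", "\n", "  aerodynamic  \n"]
words = load_wordlist(sample_lines)
print(is_known_word("Potato", words))
```

A set makes each lookup cheap, which matters when every word of a longer text has to be checked. This only answers "is it a word?"; it carries no etymology.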
Núria Bosch
  • This is difficult: there is no single source of truth regarding word definition / etymology / disambiguation, even for English. I think there is a tool for Wiktionary that provides etymology, but I can't find it right now. – amirouche Mar 23 '18 at 19:44

2 Answers

3

You could use the open-source ety package. Disclosure: I'm a contributor to the project.

It's based on the data used in the research "Etymological Wordnet: Tracing the History of Words", which has already been pre-scraped from Wiktionary.

Some examples:

>>> import ety

>>> ety.origins("potato")
[Word(batata, language=Taino)]

>>> ety.origins('drink', recursive=True)
[Word(drync, language=Old English (ca. 450-1100)),
 Word(drinken, language=Middle English (1100-1500)),
 Word(drincan, language=Old English (ca. 450-1100))]

>>> print(ety.tree('aerodynamically'))
aerodynamically (English)
├── -ally (English)
└── aerodynamic (English)
    ├── aero- (English)
    │   └── ἀήρ (Ancient Greek (to 1453))
    └── dynamic (English)
        └── dynamique (French)
            └── δυναμικός (Ancient Greek (to 1453))
                └── δύναμις (Ancient Greek (to 1453))
                    └── δύναμαι (Ancient Greek (to 1453))
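
To apply a lookup like this to every word of a text, as the question asks, one possible sketch is below. It is not part of ety: `tokenize`, `etymologies_for_text` and `demo_lookup` are hypothetical helper names, and the lookup is passed in as a function so that the sketch runs without any package installed. With ety installed, passing `ety.origins` as the lookup should fit, given the calls shown above.

```python
import re
from typing import Callable, Dict, List


def tokenize(text: str) -> List[str]:
    """Split text into lowercase alphabetic words (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def etymologies_for_text(text: str, lookup: Callable[[str], list]) -> Dict[str, list]:
    """Map each distinct word in the text to whatever the lookup returns for it."""
    results: Dict[str, list] = {}
    for word in tokenize(text):
        if word not in results:  # look each distinct word up only once
            results[word] = lookup(word)
    return results


# Stand-in lookup so the example is self-contained; the single entry mirrors
# the ety output for "potato" shown above. Assumption: in practice one would
# pass lookup=ety.origins instead of this stub.
_demo_origins = {"potato": ["batata (Taino)"]}


def demo_lookup(word: str) -> list:
    """Return the stub origins for a word, or an empty list if unknown."""
    return _demo_origins.get(word, [])


print(etymologies_for_text("Potato, potato drink", demo_lookup))
```

Keeping the lookup as a parameter also makes it easy to swap in a different source later, or to wrap it in a cache if the same words recur across many texts.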
alxwrd
1

Using PyDictionary may be a good option.

Tom