
I am looking for a downloadable (free or paid) English dictionary, preferably from Oxford, Cambridge, or Webster, in text or XML format, to use for some NLP work.

Ideally, each entry would have

  • a full part of speech,
  • pronunciation,
  • morphology (for verbs and nouns),
  • multiple sense/definition entries,

as on the following page: http://www.merriam-webster.com/dictionary/side.

The actual text of the definitions is not important. What I need most is the part of speech, the pronunciation, the morphology, and the order of the definition entries.

I am also wondering: what does the Stanford NLP toolkit use as lexical resources when it does POS tagging?

Thank you.

InformedA

1 Answer


Similar questions have been asked here and here. In summary:

  1. Part-of-speech dictionary (unfortunately, with quite a narrow tag set).
  2. Pronouncing Dictionary
  3. Multiple senses - WordNet
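If the pronouncing dictionary you end up with is a plain-text file, reading it into a lookup table is straightforward. Here is a minimal sketch that assumes a CMUdict-style layout (word, whitespace, space-separated phonemes, alternate pronunciations marked like `READ(1)`, comment lines starting with `;;;`). That layout and the sample lines are my assumptions, so check them against the file you actually download.

```python
import re
from collections import defaultdict

# Hypothetical sample in an assumed CMUdict-style layout; not copied from any real file.
SAMPLE = """\
;;; comment line
SIDE  S AY1 D
READ  R EH1 D
READ(1)  R IY1 D
"""

# word, optional "(n)" variant marker, whitespace, then the phoneme string
ENTRY = re.compile(r"^(?P<word>[^\s(]+)(?:\((?P<variant>\d+)\))?\s+(?P<phones>.+)$")


def parse_pronunciations(text):
    """Map each lowercased word to a list of pronunciations (lists of phoneme strings)."""
    result = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        match = ENTRY.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        result[match.group("word").lower()].append(match.group("phones").split())
    return dict(result)


pronunciations = parse_pronunciations(SAMPLE)
print(pronunciations["read"])
```

Variants are kept in file order, so the first pronunciation listed for a word stays first in the list.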

A morphological dictionary can be found in the FreeLing distribution; see data/en/dicc.src. By the way, the distribution also includes sense and phonetic dictionaries.
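To give an idea of how such a morphological dictionary could be consumed, here is a rough sketch. It assumes each line has the form `form lemma tag [lemma tag ...]`, i.e. a word form followed by one or more lemma/tag pairs. That layout and the sample lines are assumptions on my part, so verify them against the actual file before relying on this.

```python
# Hypothetical sample lines in the assumed "form lemma tag [lemma tag ...]" layout.
SAMPLE_LINES = [
    "sides side NNS side VBZ",
    "sided side VBD side VBN",
]


def parse_dicc_line(line):
    """Split one dictionary line into (form, [(lemma, tag), ...])."""
    fields = line.split()
    # One form plus at least one complete lemma/tag pair means an odd count >= 3.
    if len(fields) < 3 or len(fields) % 2 == 0:
        raise ValueError(f"unexpected line layout: {line!r}")
    form, rest = fields[0], fields[1:]
    analyses = list(zip(rest[0::2], rest[1::2]))
    return form, analyses


# Build a form -> analyses lookup table from the sample lines.
morphology = dict(parse_dicc_line(line) for line in SAMPLE_LINES)
print(morphology["sides"])
```

Keeping all analyses per form (rather than only the first) preserves the noun/verb ambiguity that the question cares about.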

As for the Stanford POS tagger: it is trained on the Penn Treebank (proof).
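In other words, the tagger's lexical knowledge comes from an annotated corpus rather than from a dictionary. A toy illustration of that idea: collect, for each word, the tags it was seen with in tagged text. The `word_TAG` token format and the sample sentences below are hypothetical, and this is not how the Stanford tagger itself is implemented.

```python
from collections import Counter, defaultdict

# Hypothetical tagged text using Penn Treebank-style tags in an assumed word_TAG format.
TAGGED = "The_DT side_NN door_NN opened_VBD ._. They_PRP side_VBP with_IN us_PRP ._."


def build_lexicon(tagged_text, sep="_"):
    """Count how often each lowercased word occurs with each tag."""
    lexicon = defaultdict(Counter)
    for token in tagged_text.split():
        # Split on the last separator so words containing it stay intact.
        word, _, tag = token.rpartition(sep)
        if not word:
            continue  # token had no separator
        lexicon[word.lower()][tag] += 1
    return lexicon


lexicon = build_lexicon(TAGGED)
print(lexicon["side"].most_common())
```

A lexicon built this way only knows the words and tags that occur in the corpus, which is one reason a real dictionary can still be a useful complement.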

Nikita Astrakhantsev
  • Thanks a lot! [1] seems to need some proofreading. I didn't know that the Stanford tagger uses POS lexical resources that are not built from a dictionary. Nevertheless, this is great. – InformedA Feb 22 '15 at 16:00
  • Cool! I can't find the file data/en/dicc.src in FreeLing. Has it moved? – David Portabella May 24 '17 at 09:38