18

I need all the words from the spaCy vocab. Suppose I initialize my spaCy model as:

nlp = spacy.load('en')

How do I get the text of words from nlp.vocab?

petezurich
pauli

2 Answers

34

You can get them as a list like this:

list(nlp.vocab.strings)
David
7

As of spaCy v3.0, the 'en' shortcut no longer works, so we first need to download a model package:

python -m spacy download en_core_web_sm

and then load it by its full name, e.g.

import spacy
nlp = spacy.load("en_core_web_sm")
words = set(nlp.vocab.strings)
word = 'would'
print(f"Is '{word}' an English word: {word in words}")  # True
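Note that nlp.vocab.strings is spaCy's string store, so it can hold entries that are not ordinary words, such as punctuation, numbers, and label names. If only word-like entries are wanted, one possible approach is to filter the strings. The sketch below is a minimal illustration, not part of either answer: the helper names alphabetic_words and spacy_vocab_words are hypothetical, and str.isalpha() is just one assumed definition of "word" (purely alphabetic labels would still pass it).

```python
def alphabetic_words(strings):
    """Return the sorted, de-duplicated, purely alphabetic entries of `strings`."""
    # str.isalpha() rejects entries containing digits, punctuation,
    # underscores, or whitespace; it is an assumed notion of "word".
    return sorted({s for s in strings if s.isalpha()})


def spacy_vocab_words(model_name="en_core_web_sm"):
    """Load a spaCy pipeline and return the word-like strings in its vocab."""
    # Imported here so alphabetic_words() stays usable without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)
    # Iterating over the string store yields its strings.
    return alphabetic_words(nlp.vocab.strings)
```

With the model downloaded as above, words = spacy_vocab_words() gives a sorted list, and 'would' in words performs the same membership check as before.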
tyrex