54

I'm working on a codebase that uses spaCy. I installed spaCy using:

sudo pip3 install spacy

and then

sudo python3 -m spacy download en

At the end of this last command, I got a message:

    Linking successful
    /home/rayabhik/.local/lib/python3.5/site-packages/en_core_web_sm -->
    /home/rayabhik/.local/lib/python3.5/site-packages/spacy/data/en

    You can now load the model via spacy.load('en')

Now, when I try running my code, on the line:

    from spacy.en import English

it gives me the following error:

ImportError: No module named 'spacy.en'

I've looked on Stack Exchange, and the closest question is 'Import error with spacy: "No module named en"', which does not solve my problem.

Any help would be appreciated. Thanks.

Edit: I might have solved this by doing the following:

 Python 3.5.2 (default, Sep 14 2017, 22:51:06) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import spacy
>>> spacy.load('en')
<spacy.lang.en.English object at 0x7ff414e1e0b8>

and then using:

from spacy.lang.en import English

I'm still keeping this open in case there are any other answers.

eyllanesc
rayabhik

14 Answers

64

Yes, I can confirm that your solution is correct. The version of spaCy you downloaded from pip is v2.0, which includes a lot of new features, but also a few changes to the API. One of them is that all language data has been moved to a submodule spacy.lang to keep things cleaner and better organised. So instead of using spacy.en, you now import from spacy.lang.en.

- from spacy.en import English
+ from spacy.lang.en import English
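
If the same code has to run against both old and new spaCy installs, the renamed import can be wrapped in a small fallback. This is only a sketch: import_first and load_english_class are hypothetical helper names, not part of spaCy.

```python
import importlib


def import_first(candidates):
    """Return the first module in `candidates` that can be imported."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Also covers ModuleNotFoundError; try the next location.
            continue
    raise ImportError(
        "none of these modules could be imported: %s" % ", ".join(candidates)
    )


def load_english_class():
    """Look for English in the v2 location first, then the pre-v2 one."""
    module = import_first(["spacy.lang.en", "spacy.en"])
    return module.English
```

Calling load_english_class() then stands in for the plain import statement in legacy code.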

However, it's also worth mentioning that what you download when you run spacy download en is not the same as spacy.lang.en. The language data shipped with spaCy includes the static data like tokenization rules, stop words or lemmatization tables. The en package that you can download is a shortcut for the statistical model en_core_web_sm. It includes the language data, as well as binary weights to enable spaCy to make predictions for part-of-speech tags, dependencies and named entities.

Instead of just downloading en, I'd actually recommend using the full model name, which makes it much more obvious what's going on:

python -m spacy download en_core_web_sm

import spacy
nlp = spacy.load("en_core_web_sm")

When you call spacy.load, spaCy does the following:

  1. Find the installed model named "en_core_web_sm" (a package or shortcut link).
  2. Read its meta.json and check which language it's using (in this case, spacy.lang.en), and how its processing pipeline should look (in this case, tagger, parser and ner).
  3. Initialise the language class and add the pipeline to it.
  4. Load in the binary weights from the model data so pipeline components (like the tagger, parser or entity recognizer) can make predictions.
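
The outcome of steps 2 and 3 can be inspected on the object that spacy.load returns. A minimal sketch, written so that it also runs when spaCy or the model is missing; describe_model is a hypothetical helper name. nlp.meta holds the contents of meta.json and nlp.pipe_names lists the pipeline components.

```python
def describe_model(name="en_core_web_sm"):
    """Load a model and report its language and pipeline, or why loading failed."""
    try:
        import spacy
        nlp = spacy.load(name)
    except (ImportError, OSError) as err:
        # ImportError: spaCy itself is not installed.
        # OSError: spaCy could not find the model (error E050).
        return {"loaded": False, "error": str(err)}
    return {
        "loaded": True,
        "lang": nlp.meta.get("lang"),      # read from the model's meta.json
        "pipeline": list(nlp.pipe_names),  # names of the pipeline components
    }


print(describe_model())
```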

See this section in the docs for more details.

Ines Montani
  • 1
    Thanks a bunch Ines. Yes, I did see that there was no need to import English. This is code written by someone else, and I don't have the time to clean it up right now, but can hopefully later. – rayabhik Nov 15 '17 at 02:35
  • 4
    I have spaCy 2.0.12 and the above does not work. @gdaras's solution of `from spacy.lang.en import English` does work. – Anil_M Sep 10 '18 at 21:09
  • @Anil_M you should make this an answer! It works while the accepted answer doesn't for later versions of 2.0.x – duhaime Apr 25 '19 at 10:43
  • 1
    I've updated my reply to include more info on the model packages to prevent the confusion! – Ines Montani Apr 26 '19 at 11:58
  • Even after I ran python -m spacy download en_core_web_sm, Jupyter throws an error when I try to execute nlp = spacy.load("en_core_web_sm"). The error is "[E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory." – fuat May 10 '20 at 09:24
  • @InesMontani does spaCy 3.0 have wordnet? It seems that wordnet only supports the 'en' model. – the_learning_child Apr 30 '21 at 18:19
15

I used the following command to install spaCy from the Anaconda distribution:

conda install -c conda-forge spacy

and after that, I was able to download English using the following command without any error.

 python -m spacy download en
anees ahmed
9

I had to use en_core_web_sm instead of en to make that work. With en, it complained about a permission problem. The following works perfectly:

import spacy
spacy.load('en_core_web_sm')
from spacy.lang.en import English
Elham
4

I think there is some confusion in the answers provided. These points are correct:

  • you should import from spacy.lang.en
  • spacy.load('en') is indeed a shortcut for loading models.

But: the package en_core_web_sm is not the same thing as what you import from spacy.lang.en. In fact, the first is produced from the second by training with spacy train on a dataset and then packaging the result. spacy.lang.en contains the language definition: lemma lookup tables, stop words, lexical attributes (and more). But that and only that. It has not been trained on a dataset, so the dependency parser and other statistical functionality will not work with it alone.
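
The difference shows up when the language class is instantiated directly. A minimal sketch, assuming spaCy 2.0 or later (the import is guarded so the snippet still runs without it); blank_english_summary is a hypothetical helper name.

```python
def blank_english_summary(text="This is a sentence."):
    """Tokenize with the bare English class, which has no trained components."""
    try:
        from spacy.lang.en import English
    except ImportError:
        return None  # spaCy is missing, or is older than v2.0
    nlp = English()  # language data only: tokenizer rules, stop words, ...
    doc = nlp(text)
    return {
        "tokens": [token.text for token in doc],
        "pipeline": list(nlp.pipe_names),  # empty: no tagger, parser or ner
    }


print(blank_english_summary())
```

Tokenization works because it only needs the rules in spacy.lang.en; tags, dependencies and entities need a trained model such as en_core_web_sm.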

I think this should be clear enough when working with spaCy.

gdaras
3
pip install spacy
python -m spacy download en

This works for me

User
2

Anaconda Users

  1. If you're using a conda virtual environment, be sure that it's the same version of Python as that in your base environment. To verify this, run python --version in each environment. If they differ, create a new virtual environment with that version of Python (e.g. conda create --name myenv python=x.x.x).

  2. Activate the virtual environment (conda activate myenv)

  3. conda install -c conda-forge spacy
  4. python -m spacy download en_core_web_sm

I just ran into this issue, and the above worked for me. This addresses the issue of the download occurring in an area that is not accessible to your current virtual environment.

You should then be able to run the following:

import spacy
nlp = spacy.load("en_core_web_sm")
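
To check whether the model landed somewhere the active environment can see, the running interpreter can report what it finds. A small sketch using only the standard library; environment_report is a hypothetical helper name.

```python
import importlib.util
import sys


def environment_report(packages=("spacy", "en_core_web_sm")):
    """Report the running interpreter and where each package would load from."""
    report = {"python": sys.executable, "version": sys.version.split()[0]}
    for name in packages:
        spec = importlib.util.find_spec(name)
        # spec is None when the package is not visible to this interpreter.
        report[name] = spec.origin if spec is not None else None
    return report


for key, value in environment_report().items():
    print(key, "->", value)
```

A value of None for en_core_web_sm while spacy resolves to a path suggests the download went into a different environment.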
Colonel_Old
1

The en_core_web_sm folder was downloaded outside the spacy folder. I copied it into the spacy/data folder and could then run the code as documented by spaCy.

Sujay DSa
1

If you are facing this issue on Windows 10 with an Anaconda installation, look for your conda Python executable by running where python on the command line before running the script.

In my case, the python on the PATH was

C:\Users\XXX\.windows-build-tools\python27\python.exe

whereas what I needed was from

c:\Users\XXX\AppData\Local\Continuum\anaconda3\python.exe

Just add the correct Python to the PATH, or go to this location and run

python -m spacy download en

and it should work.
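
The same check can be done from inside Python by comparing the python found on PATH with the interpreter that is actually running. A rough sketch; path_python_matches_current is a hypothetical helper name, and the comparison deliberately does not resolve symlinks so that a virtual environment is not mistaken for its base install.

```python
import os
import shutil
import sys


def _normalise(path):
    """Absolute, case-normalised path without resolving symlinks."""
    return os.path.normcase(os.path.abspath(path))


def path_python_matches_current():
    """Compare the `python` on PATH with the interpreter running this code."""
    on_path = shutil.which("python")  # None if no `python` is on PATH
    current = sys.executable
    same = (
        on_path is not None
        and bool(current)
        and _normalise(on_path) == _normalise(current)
    )
    return on_path, current, same


print(path_python_matches_current())
```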

fatcook
1

According to the official website, you should do the following:

python -m spacy download en

However, this surprisingly does NOT work for me.
In case it is relevant, my environment is OS X 10.15 with Python 3.8 and pip 19.3.1.
Try this instead:

spacy download en
QIAN KEQIAO
1

For me what worked are these steps:

import sys
!{sys.executable} -m pip install spacy
!{sys.executable} -m spacy download en

I ran these steps in my Spyder console (installed through Anaconda).
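
The ! prefix only works in IPython-style consoles. In a plain Python script, roughly the same thing can be done with subprocess, again using sys.executable so the packages go to the interpreter that is running. A sketch; module_command and run_module are hypothetical helper names.

```python
import subprocess
import sys


def module_command(*args):
    """Build a command that runs `python -m ...` with the current interpreter."""
    return [sys.executable, "-m"] + list(args)


def run_module(*args):
    """Run the command; raises CalledProcessError if it exits with an error."""
    subprocess.check_call(module_command(*args))


# Usage (left commented out because it installs packages):
# run_module("pip", "install", "spacy")
# run_module("spacy", "download", "en")
```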

msh855
1

For me, the steps below worked in my Jupyter notebook:

pip install spacy

import spacy
from spacy.cli import download

print(download('en_core_web_sm'))
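
spacy.cli.download is the function behind python -m spacy download, so calling it from a notebook installs the model for the interpreter the kernel is running on. To avoid downloading again on every run, the call can be guarded. A sketch; ensure_model and its downloader parameter are hypothetical and exist only for illustration.

```python
import importlib.util


def ensure_model(name="en_core_web_sm", downloader=None):
    """Download `name` only if it is not importable; return True if a download ran."""
    if importlib.util.find_spec(name) is not None:
        return False  # already installed in this environment
    if downloader is None:
        # Imported lazily so the check above works even without spaCy.
        from spacy.cli import download as downloader
    downloader(name)
    return True
```

Depending on the spaCy version, a freshly downloaded model may only become loadable after the kernel is restarted.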
Ihor Konovalenko
  • Hi and thanks for the answer. Could you explain what your code does and how does it build on the previous 12 answers to the question? – Simas Joneliunas Feb 12 '22 at 00:13
1

I solved it by:

import spacy.cli
spacy.cli.download("en_core_web_lg")

Hope it helps

xackobo
0

from spacy.lang.en import English instead of from spacy.en import English

0

My solution for en, it and fr:

!pip install spacy
!python -m spacy download en
!python -m spacy download it
!python -m spacy download fr

then

import spacy
spacy.load('en')
spacy.load('it')
spacy.load('fr')
venergiac