31

I am using spaCy in Google Colab to build an NER model, for which I downloaded the spaCy 'en_core_web_lg' model using

import spacy.cli
spacy.cli.download("en_core_web_lg")

and I get a message saying

✔ Download and installation successful
You can now load the model via spacy.load('en_core_web_lg')

However, when I try to load the model

nlp = spacy.load('en_core_web_lg')

the following error is printed:

OSError: [E050] Can't find model 'en_core_web_lg'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

Could anyone help me with this problem?

Davide Fiocco
Jithin P James
    As of June 2022, executing ```import spacy.cli; spacy.cli.download("en_core_web_lg"); nlp = spacy.load('en_core_web_lg')``` on colab.to no longer raises an error. It looks like the problem has vanished! – Davide Fiocco Jun 20 '22 at 21:42

4 Answers

81

Running

import spacy.cli
spacy.cli.download("en_core_web_lg")
nlp = spacy.load("en_core_web_lg")

shouldn't yield any errors anymore with recent spaCy versions.

If the code still raises an error, run the following in a cell of its own. It takes a while but, unlike spacy.cli, it shows the download progress:

!python -m spacy download en_core_web_lg

Then, *** restart the Colab runtime ***, using either

  • the Colab menu Runtime > Restart runtime, or
  • the keyboard shortcut Ctrl+M . (Ctrl+M, then the period key).

After that, executing

import spacy
nlp = spacy.load('en_core_web_lg')

should work flawlessly.
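
A plausible reason the restart helps is that the model is installed as an ordinary Python package while the notebook's interpreter is already running. As a quick diagnostic before restarting, the minimal sketch below checks whether the current process can see the package. The helper name `model_is_importable` is made up for illustration, and clearing the import caches is not guaranteed to make a freshly installed model visible; if the check returns False, restart the runtime as described above.

```python
import importlib
import importlib.util


def model_is_importable(name: str) -> bool:
    """Return True if `name` resolves to a top-level package in this process."""
    # Drop cached directory listings so packages installed after
    # interpreter start-up have a chance of being found.
    importlib.invalidate_caches()
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Hypothetical usage in a Colab cell, after the download has finished.
    print(model_is_importable("en_core_web_lg"))
```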

Davide Fiocco
21

In Google Colab Notebooks, you should import the model as a package.

However you download and install the model:

!pip install <model_s3_url> # tar.gz file e.g. from release notes like https://github.com/explosion/spacy-models/releases//tag/en_core_web_lg-2.3.1
!pip install en_core_web_lg
import spacy

you don't have permission in Colab to load the model with the usual spaCy calls:

nlp = spacy.load("en_core_web_lg")  # not via package name
nlp = spacy.load("/path/to/en_core_web_lg")  # not via paths
nlp = spacy.load("en")  # nor via shortcut links

Instead, import the model and load it directly:

import en_core_web_lg
nlp = en_core_web_lg.load()

Then use as directed:

doc = nlp("This is a sentence. Soon, it will be knowledge.")
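
If the model name is only known at run time, the same import-then-load pattern can be written with importlib. This is a minimal sketch, not part of spaCy's API: `load_model_package` is a hypothetical helper, and it assumes only what the snippet above shows, namely that an installed model package exposes a `load()` function.

```python
import importlib


def load_model_package(name: str):
    """Import an installed model package by name and call its load()."""
    # Refresh the import caches in case the package was installed
    # after this Python process started.
    importlib.invalidate_caches()
    module = importlib.import_module(name)
    return module.load()


# Hypothetical usage, once the model package has been installed:
# nlp = load_model_package("en_core_web_lg")
# doc = nlp("This is a sentence. Soon, it will be knowledge.")
```

If the package is still not visible to the running interpreter, `importlib.import_module` raises `ModuleNotFoundError`, which is a hint that the runtime needs a restart.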
Davide Fiocco
Briggsly
6

It seems the best answer is on this thread: How to install models/download packages on Google Colab?

import spacy.cli
spacy.cli.download("en_core_web_lg")
import en_core_web_lg
nlp = en_core_web_lg.load()
Davide Fiocco
GoPackGo
0

I ran into a similar issue on Google Colab with:

nlp = spacy.load('en_core_web_md')

I suspect it may have something to do with the size of the model. It worked for me using the small spaCy model:

!python -m spacy download en_core_web_sm
import spacy
nlp = spacy.load('en_core_web_sm')
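
Falling back to a smaller model can also be done in code. The sketch below tries several model names in order and returns the first one that loads. `load_first_available` is a hypothetical helper, not a spaCy function; it relies only on the fact, shown in the question, that `spacy.load` raises `OSError` (E050) when a model cannot be found. The `loader` parameter is an assumption added so the logic can be exercised without spaCy installed.

```python
from typing import Any, Callable, Optional, Sequence


def load_first_available(
    names: Sequence[str],
    loader: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Return the first pipeline that loads, trying `names` in order."""
    if loader is None:
        import spacy  # imported lazily; only needed for the default loader

        loader = spacy.load
    last_error = None
    for name in names:
        try:
            return loader(name)
        except OSError as err:  # raised when the model cannot be found
            last_error = err
    raise OSError(f"None of the models could be loaded: {list(names)}") from last_error


# Hypothetical usage: prefer the large model, fall back to the small one.
# nlp = load_first_available(["en_core_web_lg", "en_core_web_sm"])
```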