27

I have an app that uses the spaCy model "en_core_web_sm". I have tested the app on my local machine and it works fine.

However, when I deploy it to Heroku, it gives me this error:

"Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory."

My requirements file contains spacy==2.2.4.

I have been doing some research on this error and found that the model needs to be downloaded separately using this command: python -m spacy download en_core_web_sm

I have been looking for ways to add the same to my requirements.txt file but haven't been able to find one that works!

I also tried adding the following line to the requirements file:

-e git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz#egg=en_core_web_sm==2.2.0

but it gave this error:

"Cloning git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz to /app/.heroku/src/en-core-web-sm

Running command git clone -q git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz /app/.heroku/src/en-core-web-sm fatal: remote error: explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz is not a valid repository name"

Is there a way to get this spaCy model to install from the requirements file? Or is any other fix possible?

Thank you.

rohit0505
  • You're getting that error because that's a URL to an archive file... You need to pass a URL to a repository for git to be able to clone it... – Swetank Poddar May 09 '20 at 19:19
  • Thanks Swetank, I'm not able to figure out what that url would be. Would you be able to help please? Thank you so much in advance. – rohit0505 May 09 '20 at 19:36
  • The answer below has been edited to answer your question! :D – Swetank Poddar May 09 '20 at 19:54
  • Thanks Swetank, the edited answer still gives an error: " Cloning git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz to /tmp/pip-req-build-at911nv7 Running command git clone -q git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz /tmp/pip-req-build-at911nv7 fatal: remote error: explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz is not a valid repository name" – rohit0505 May 10 '20 at 06:29

3 Answers

27

Add this to your deployment step; if you are using Docker, add it to your Dockerfile:

pip3 install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz --user

EDIT

Add these two lines to requirements.txt:

spacy>=2.2.0,<3.0.0
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz#egg=en_core_web_sm

See the "Downloading and requiring model dependencies" section of the spaCy documentation.

For more detail on how to add a GitHub source, see this and follow YPCrumble's answer.
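Since the error message says the model is expected to be "a Python package", one way to confirm that the install step really put it on the server is to check whether the package is importable. The following is only a minimal sketch using the standard library; the function name model_is_installed is made up for illustration and is not part of spaCy.

```python
import importlib.util


def model_is_installed(package_name: str = "en_core_web_sm") -> bool:
    """Return True if a top-level package with this import name can be found.

    Hypothetical helper (not a spaCy API): a model installed through pip
    is an ordinary Python package, so find_spec() can locate it.
    """
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    name = "en_core_web_sm"
    if model_is_installed(name):
        print(f"{name} is installed, so spacy.load('{name}') should find it.")
    else:
        print(f"{name} is missing; check the model line in requirements.txt.")
```

Running this once in the deployed environment (for example at app startup) makes a missing model show up with a clearer message than the spacy.load failure.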

tausif
  • Thanks Tausif, is there a way I can add this to the requirements file? I'm not using docker. – rohit0505 May 09 '20 at 19:34
  • Thanks Tausif, it still gives an error: " Cloning git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz to /tmp/pip-req-build-at911nv7 Running command git clone -q git://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz /tmp/pip-req-build-at911nv7 fatal: remote error: explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz is not a valid repository name" – rohit0505 May 10 '20 at 06:28
7

For en-core-web-sm==3.0.0, this worked for me.

In requirements.txt, replace the line "en-core-web-sm==3.0.0" with:

en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0-py3-none-any.whl
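The URLs in this thread all follow the same pattern: release tag "<name>-<version>", then a file with the same stem. As a rough sketch, the requirements.txt line can be generated from the model name and version. The helper name model_requirement is hypothetical, and the code assumes the release actually publishes the requested file type (it does not check, and a given version may offer only a tarball or only a wheel).

```python
# Base of the release URLs quoted in this thread.
RELEASE_BASE = "https://github.com/explosion/spacy-models/releases/download"


def model_requirement(name: str, version: str, wheel: bool = False) -> str:
    """Build a requirements.txt line for a spaCy model release.

    Hypothetical helper: it only formats strings and does not verify
    that the file exists on GitHub.
    """
    tag = f"{name}-{version}"
    if wheel:
        # "name @ url" form, with the pip-style dashed package name.
        url = f"{RELEASE_BASE}/{tag}/{tag}-py3-none-any.whl"
        return f"{name.replace('_', '-')} @ {url}"
    # Plain tarball URL with an #egg= fragment.
    return f"{RELEASE_BASE}/{tag}/{tag}.tar.gz#egg={name}"


if __name__ == "__main__":
    print(model_requirement("en_core_web_sm", "3.0.0", wheel=True))
    print(model_requirement("en_core_web_sm", "2.2.0"))
```

Both forms point pip at a direct https download, which avoids the "git clone" attempt that the -e git://... line triggers.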
R Kumar
6

OK, after some more searching, I found a solution that worked:

I downloaded the tarball from the URL that @tausif shared in his answer to my local system.

I saved it in the directory that contains my requirements.txt file.

Then I added this line to my requirements.txt file: ./en_core_web_sm-2.2.5.tar.gz

I then deployed to Heroku; the deploy succeeded and the app works perfectly now.
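The "add this line" step can also be scripted so it is not duplicated on repeated runs. This is just an illustrative sketch; add_local_requirement is a made-up name, and it assumes the tarball sits in the same directory as requirements.txt, as described above.

```python
from pathlib import Path


def add_local_requirement(requirements: Path, tarball: Path) -> bool:
    """Append './<tarball name>' to requirements.txt if it is not there yet.

    Hypothetical helper. Returns True if the file was changed,
    False if the line was already present.
    """
    if tarball.parent.resolve() != requirements.parent.resolve():
        raise ValueError("tarball must sit next to requirements.txt")
    line = f"./{tarball.name}"
    lines = requirements.read_text().splitlines() if requirements.exists() else []
    if line in (entry.strip() for entry in lines):
        return False
    lines.append(line)
    requirements.write_text("\n".join(lines) + "\n")
    return True


if __name__ == "__main__":
    changed = add_local_requirement(
        Path("requirements.txt"), Path("en_core_web_sm-2.2.5.tar.gz")
    )
    print("requirements.txt updated" if changed else "line already present")
```

The tarball itself still has to be committed to the repository so that it is present when Heroku runs pip.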

rohit0505
  • Check the edit to my answer; it may be a cleaner way to do it, if it works. – tausif May 11 '20 at 18:24
  • Thank you so much Tausif, I will test the latest edit to your answer in my next update of the app and will report back here. – rohit0505 May 12 '20 at 19:12