To make a tool I have built easier to install, I would like to provide an environment.yml file that installs all the required dependencies. I do have one; however, spaCy language models require an additional download via (for example) python -m spacy download en_core_web_sm.

My issue is that I would like this model to be downloaded by a simple conda env create -f environment.yml. I know that pip packages can be installed via Conda, but I do not know how to perform the "download" inside the environment.yml file. Thanks in advance for any help you can provide.

  • If I understand your question correctly, you wish to have your environment.yml file contain pip imports as well? Possible duplicate of: https://stackoverflow.com/questions/35245401/combining-conda-environment-yml-with-pip-requirements-txt. Other helpful info might be how to install pip in a conda environment, ``conda install -n myenv pip`` (more info here: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#using-pip-in-an-environment). – Elmstead Jun 06 '22 at 15:19
  • Are you saying that adding a few lines like "- spacy: en_core_web_sm" will behave the same as "python -m spacy download en_core_web_sm"? – AfonsoSalgadoSousa Jun 06 '22 at 15:29
  • The issue is different from the one you sent. Following that issue, these couple of lines should do the trick: "- spacy: en_core_web_sm", but they do not install the model. It might be because download and install are different. – AfonsoSalgadoSousa Jun 06 '22 at 15:35
  • @Elmstead no need for Pip - Conda Forge packages the models directly (see https://github.com/conda-forge/spacy-models-feedstock). – merv Jun 06 '22 at 15:50

1 Answer

As the spaCy documentation for the download command states:

DOWNLOADING BEST PRACTICES
The download command is mostly intended as a convenient, interactive wrapper – it performs compatibility checks and prints detailed messages in case things go wrong. It’s not recommended to use this command as part of an automated process. If you know which package your project needs, you should consider a direct download via pip, or uploading the package to a local PyPI installation and fetching it straight from there. This will also allow you to add it as a versioned package dependency to your project.

While it is possible to include PyPI dependencies in a Conda environment YAML, Conda Forge also publishes spaCy models as packages via the spacy-models-feedstock. For the OP's example, this means adding the package spacy-model-en_core_web_sm to the dependencies in environment.yml.
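
As a rough sketch, an environment.yml along these lines should pull in the model together with spaCy itself. The environment name my-tool is a placeholder, and the lack of version pins is an assumption made for brevity; only the spacy-model-en_core_web_sm package name comes from the feedstock mentioned above.

```yaml
# environment.yml: a minimal sketch, not a definitive configuration
name: my-tool                      # hypothetical environment name
channels:
  - conda-forge                    # the spaCy model packages are published here
dependencies:
  - python
  - spacy
  - spacy-model-en_core_web_sm     # Conda Forge package for the en_core_web_sm model
```

With a file like this, conda env create -f environment.yml should install the model in the same step as the other dependencies, so no separate python -m spacy download call is needed.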

merv