I am trying to launch a Django service with Docker. The service uses the NLTK library. In the Dockerfile I run a setup.py that calls nltk.download, and according to the build logs this step completes successfully.
But when I run the image and connect to my Django service, I get an error saying the stopwords resource was not found, as if nltk.download had never run.
Dockerfile code -
RUN . ${PYTHON_VIRTUAL_ENV_FOLDER}/bin/activate && python ${PYTHON_APP_FOLDER}/setup.py
setup.py code -
import nltk
import os
nltk.download('stopwords', download_dir=os.getcwd() + '/nltk_data/')
nltk.download('wordnet', download_dir=os.getcwd() + '/nltk_data/')
Error:
**********************************************************************
Resource stopwords not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('stopwords')
Searched in:
- '/root/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- '/usr/src/venv/nltk_data'
- '/usr/src/venv/share/nltk_data'
- '/usr/src/venv/lib/nltk_data'
**********************************************************************
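To narrow this down, here is a minimal sketch of a check that compares the directory setup.py downloads into against the directories listed in the error. It is only an illustration: the helper name is_on_search_path and the working directory /usr/src/app are hypothetical (I have not confirmed what os.getcwd() returns during the build), and in the real container the list of searched directories would come from nltk.data.path rather than being copied by hand.

```python
import os

# Directories NLTK reported searching, copied from the error message above.
SEARCHED = [
    '/root/nltk_data',
    '/usr/share/nltk_data',
    '/usr/local/share/nltk_data',
    '/usr/lib/nltk_data',
    '/usr/local/lib/nltk_data',
    '/usr/src/venv/nltk_data',
    '/usr/src/venv/share/nltk_data',
    '/usr/src/venv/lib/nltk_data',
]


def is_on_search_path(download_dir, search_paths):
    """Return True if download_dir is one of the searched directories."""
    # normpath drops the trailing slash so '/a/b/' and '/a/b' compare equal.
    target = os.path.normpath(download_dir)
    return any(os.path.normpath(p) == target for p in search_paths)


# Hypothetical build-time working directory; replace it with the real
# value of os.getcwd() printed from setup.py during the image build.
assumed_cwd = '/usr/src/app'
download_dir = assumed_cwd + '/nltk_data/'

print(download_dir, is_on_search_path(download_dir, SEARCHED))
```

If this prints False for the real build-time working directory, the data was written to a folder that NLTK never looks in at runtime.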
Any idea what is wrong here? The same code works when I run it outside Docker.