I have a Django 1.8 view that looks like this:
def sourcedoc_parse(request, sourcedoc_id):
    sourcedoc = Sourcedoc.objects.get(pk=sourcedoc_id)
    nltk.data.path.append('/root/nltk_data')
    new_words = []
    english_vocab = set(w.lower() for w in nltk.corpus.gutenberg.words())  # <--- the line where the error occurs
    results = {}
    template = 'sourcedoc_parse.html'
    params = {'sourcedoc': sourcedoc, 'results': results, 'new_words': new_words, 'BASE_URL': BASE_URL}
    return render_to_response(template, params, context_instance=RequestContext(request))
It gives me the following error:
Django Version: 1.8
Python Version: 2.7.6
...
Traceback:
File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in get_response
132. response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/home/rosshartshorn/htdocs/worldmaker/sourcedocs/views.py" in sourcedoc_parse
107. english_vocab = set(w.lower() for w in nltk.corpus.gutenberg.words())
File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/util.py" in __getattr__
68. self.__load()
File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/util.py" in __load
56. except LookupError: raise e
Exception Type: LookupError at /sourcedoc/parse/13/
Exception Value:
**********************************************************************
Resource 'corpora/gutenberg' not found. Please use the NLTK
Downloader to obtain the resource: >>> nltk.download()
Searched in:
- '/var/www/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- '/root/nltk_data'
**********************************************************************
What is especially odd is that it works fine when I run the same code from the same directory in the Python shell:
Python 2.7.6 (default, Mar 22 2014, 22:59:38)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> english_vocab = set(w.lower() for w in nltk.corpus.gutenberg.words())
>>> 'jabberwocky' in english_vocab
False
>>> 'monster' in english_vocab
True
>>> nltk.data.path
['/root/nltk_data', '/usr/share/nltk_data', '/usr/local/share/nltk_data', '/usr/lib/nltk_data', '/usr/local/lib/nltk_data']
Does anyone know what causes the difference between running this inside a Django view and running the same thing at the Python command line? I've also tried it with 'python manage.py shell', and it works there too.
Any debugging advice on finding the difference is also welcome.
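To narrow it down, here is a minimal sketch of what I could log from both the view and the shell and then compare. The helper names `describe_process` and `describe_search_paths` are my own (hypothetical, not part of NLTK or Django). It assumes the corpus is stored either as a `corpora/gutenberg` directory or as a `corpora/gutenberg.zip` file under one of the search paths. It uses only the standard library and should run on Python 2.7 as well as 3.

```python
import getpass
import os
import sys


def describe_process():
    """Collect details that can differ between an interactive shell and a web server process."""
    try:
        user = getpass.getuser()
    except Exception:
        # Some server environments have no resolvable user name.
        user = None
    return {
        "user": user,
        "home": os.environ.get("HOME"),
        "cwd": os.getcwd(),
        "executable": sys.executable,
    }


def describe_search_paths(paths, resource="corpora/gutenberg"):
    """For each data directory, report whether this process can reach the resource."""
    report = []
    for base in paths:
        target = os.path.join(base, *resource.split("/"))
        report.append({
            "base": base,
            "base_exists": os.path.isdir(base),
            # R_OK | X_OK: the process must be able to list and enter the directory.
            "base_readable": os.access(base, os.R_OK | os.X_OK),
            # Assumption: the corpus may be unpacked or left as a .zip archive.
            "resource_exists": os.path.exists(target) or os.path.exists(target + ".zip"),
        })
    return report
```

Calling `describe_process()` and `describe_search_paths(nltk.data.path)` in both places and writing the results to the log should show whether the two processes run as different users, have a different `HOME`, or differ in which of the search directories they can actually read.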