
While running a Python script using NLTK I got this:

Traceback (most recent call last):
  File "cpicklesave.py", line 56, in <module>
    pos = nltk.pos_tag(words)
  File "/usr/lib/python2.7/site-packages/nltk/tag/__init__.py", line 110, in pos_tag
    tagger = PerceptronTagger()
  File "/usr/lib/python2.7/site-packages/nltk/tag/perceptron.py", line 140, in __init__
    AP_MODEL_LOC = str(find('taggers/averaged_perceptron_tagger/'+PICKLE))
  File "/usr/lib/python2.7/site-packages/nltk/data.py", line 641, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource u'taggers/averaged_perceptron_tagger/averaged_perceptron_tagger.pickle'
  not found.  Please use the NLTK Downloader to
  obtain the resource:  >>> nltk.download()
  Searched in:
    - '/root/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************

Can anyone explain the problem?
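For reference, the lookup that fails is essentially a search for one pickle file across a fixed list of directories. A stdlib-only sketch of the equivalent check (the directory list is taken from the error message above; this is an illustration, not NLTK's actual code):

```python
import os

# Directories NLTK searches for data, per the error message above
search_dirs = [
    os.path.expanduser("~/nltk_data"),
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/lib/nltk_data",
]
resource = "taggers/averaged_perceptron_tagger/averaged_perceptron_tagger.pickle"

# The LookupError is raised when the file exists in none of these directories
hits = [os.path.join(d, resource) for d in search_dirs
        if os.path.exists(os.path.join(d, resource))]
print(hits if hits else "not found in any search directory")
```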

erip
Shiv Shankar

14 Answers

76

Use

>>> nltk.download()

to install the missing module (the Perceptron Tagger).

(check also the answers to Failed loading english.pickle with nltk.data.load)

user2314737
    nltk.download starts a large download of data: all, all-corpora, all-nltk, book, popular,test, and thirdparty – Golden Lion Dec 01 '20 at 15:51
46

The first answer says the missing module is 'the Perceptron Tagger'; its actual name in nltk.download is 'averaged_perceptron_tagger'.

You can use this to fix the error:

nltk.download('averaged_perceptron_tagger')

Posuer
    it is `python -m nltk.downloader averaged_perceptron_tagger` if you want to download it from the command line – Papples Aug 04 '17 at 14:12
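If the script may run more than once, a defensive idiom is to download only when the resource is actually missing. A sketch (the `ensure_resource` helper is a name I made up, and the import guard just keeps the sketch loadable when nltk isn't installed):

```python
try:
    import nltk
except ImportError:          # guard so the sketch loads even without nltk
    nltk = None

def ensure_resource(resource_path, package_name):
    """Download `package_name` only if `resource_path` cannot be found."""
    if nltk is None:
        raise RuntimeError("nltk is not installed")
    try:
        nltk.data.find(resource_path)    # raises LookupError if missing
    except LookupError:
        nltk.download(package_name)

# Usage:
# ensure_resource("taggers/averaged_perceptron_tagger", "averaged_perceptron_tagger")
```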
30

TL;DR

import nltk
nltk.download('averaged_perceptron_tagger')

Or to download all packages + data + docs:

import nltk
nltk.download('all')

See How do I download NLTK data?

alvas
    Hi, may I know where this content will be saved after downloading all the nltk data by using `nltk.download("all ")` – Pyd Feb 20 '18 at 07:11
    See https://stackoverflow.com/questions/22211525/how-do-i-download-nltk-data and more specifically https://stackoverflow.com/a/36383314/610569 – alvas Feb 20 '18 at 11:10
  • Hi, Its just downloads all the packages and stops...rest codes don't execute – Partha Paul Aug 27 '21 at 12:14
15

Install all NLTK resources in one line:

python3 -c "import nltk; nltk.download('all')"

The data will be saved at ~/nltk_data.


Install only a specific resource:

Replace "all" with "averaged_perceptron_tagger" to install only this module.

python3 -c "import nltk; nltk.download('averaged_perceptron_tagger')"
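If ~/nltk_data is not convenient, nltk.download also accepts a download_dir. A sketch of downloading into a custom location and registering it at runtime (the /tmp path is just an example, and the import guard keeps the sketch loadable without nltk):

```python
try:
    import nltk
except ImportError:              # guard so the sketch loads even without nltk
    nltk = None

CUSTOM_DIR = "/tmp/nltk_data"    # example location, not a requirement

def download_to(package_name, target_dir=CUSTOM_DIR):
    """Fetch one NLTK package into target_dir and make NLTK search there."""
    if nltk is None:
        raise RuntimeError("nltk is not installed")
    nltk.download(package_name, download_dir=target_dir)
    if target_dir not in nltk.data.path:
        nltk.data.path.append(target_dir)

# Usage:
# download_to("averaged_perceptron_tagger")
```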
Lucas Azevedo
    nltk.download(), this is working fine. already marked as correct answer. – Shiv Shankar Mar 29 '19 at 03:46
  • @ShivShankar you need to do `python3` then `import nltk`, `nltk.download()` which will put you on a download prompt asking the package name that you want to download. It is not a wrong solution, it's just not as simple as the one I'm proposing. – Lucas Azevedo Mar 29 '19 at 10:52
  • for folks using google colab, the following version of Lucas Azevedo's answer worked for me import nltk; nltk.download('averaged_perceptron_tagger') – tmr May 17 '23 at 03:17
2

Problem: a LookupError when fitting a CountVectorizer from scikit-learn. Below is the code snippet.

from sklearn.feature_extraction.text import CountVectorizer
bow_transformer = CountVectorizer(analyzer=text_process).fit(X)

Solution: run the code below, then install the stopwords from the corpora collection in the NLTK downloader.

import nltk
nltk.download()
2

You can download the missing NLTK module just by

import nltk
nltk.download()

This will show the NLTK download screen. If it fails with an 'SSL: certificate verify failed' error, it should work after disabling the SSL check with the code below:

import nltk
import ssl

try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    pass
else:
    ssl._create_default_https_context = _create_unverified_https_context

nltk.download()
ishwardgret
1

Apologies if I missed another answer, but this works fine in Google Colab:

import nltk
nltk.download('all')
akD
0

Sometimes nltk.download('module_name') inside a script does not download the resource. In that case, open Python in interactive mode and run nltk.download('module_name') there.

Lucky Sunda
0

You just need to download that module for NLTK. The easiest way is to open a Python command line and type

import nltk
nltk.download('all')

That's all.

  • This solution is already provided in [another answer](https://stackoverflow.com/a/73488002/2847330). If you have any different solution please mention it. If there is any change/extension required to the original answer, then edit it. – Azhar Khan Feb 11 '23 at 07:23
  • will you tell me where you are running your code? – Shivaditya kr Jun 16 '23 at 08:35
0

If you have already executed `python -m textblob.download_corpora` (if not, first run `import nltk; nltk.download('all')` or `nltk.download('all-corpora')`) and the issue remains, it might be because some packages were not unzipped. In my case I had to unzip wordnet, as my error was "Resource wordnet not found. Please use the NLTK Downloader to obtain the resource". Solution: `cd /home/app/nltk_data/corpora`, then unzip wordnet.zip.

https://github.com/nltk/nltk/issues/3028
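The unzip step above can also be done with the stdlib, in case the unzip tool isn't available. A sketch (the corpora path follows this answer's setup and may differ on your machine):

```python
import os
import zipfile

corpora_dir = os.path.expanduser("~/nltk_data/corpora")   # adjust to your setup
archive = os.path.join(corpora_dir, "wordnet.zip")

if os.path.exists(archive):
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(corpora_dir)    # creates corpora/wordnet/
    print("extracted", archive)
else:
    print("no wordnet.zip at", archive)
```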

0

I ran this code, and it works for me:

# install_certifi.py
#
# sample script to install or update a set of default Root Certificates
# for the ssl module.  Uses the certificates provided by the certifi package:
#       https://pypi.python.org/pypi/certifi

import os
import os.path
import ssl
import stat
import subprocess
import sys

STAT_0o775 = ( stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR
             | stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP
             | stat.S_IROTH |                stat.S_IXOTH )


def main():
    openssl_dir, openssl_cafile = os.path.split(
        ssl.get_default_verify_paths().openssl_cafile)

    print(" -- pip install --upgrade certifi")
    subprocess.check_call([sys.executable,
        "-E", "-s", "-m", "pip", "install", "--upgrade", "certifi"])

    import certifi

    # change working directory to the default SSL directory
    os.chdir(openssl_dir)
    relpath_to_certifi_cafile = os.path.relpath(certifi.where())
    print(" -- removing any existing file or link")
    try:
        os.remove(openssl_cafile)
    except FileNotFoundError:
        pass
    print(" -- creating symlink to certifi certificate bundle")
    os.symlink(relpath_to_certifi_cafile, openssl_cafile)
    print(" -- setting permissions")
    os.chmod(openssl_cafile, STAT_0o775)
    print(" -- update complete")

if __name__ == '__main__':
    main()
Eric Bellet
0
  1. Open the terminal.

  2. Type python and press Enter.

  3. Type import nltk and press Enter.

  4. Type nltk.download('averaged_perceptron_tagger') and press Enter.

It should fix your error.

Ashok Chhetri
-1

If you have not downloaded the NLTK data yet, first download it and then use nltk.download('punkt'); it will give you the result.

kk.
Anisha Yadav
  • I don't believe the punkt tokenizer is the missing piece here. See, for example, Lucas' answer. – Andy Sep 09 '20 at 13:42
-1
import nltk


nltk.download('vader_lexicon')

Use this; it might work.