Corpora/stopwords not found when importing the nltk library

Question

I am trying to use the nltk package in Python 2.7:

  import nltk
  stopwords = nltk.corpus.stopwords.words('english')
  print(stopwords[:10])

Running this gives me the following error:

LookupError: 
**********************************************************************
Resource 'corpora/stopwords' not found.  Please use the NLTK
Downloader to obtain the resource:  >>> nltk.download()

So I opened my Python terminal and did the following:

import nltk  
nltk.download()

Which gives me:

showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml

However, this never seems to finish, and running my script again still gives me the same error. Any thoughts on where this goes wrong?

Kurt Bourbaki · Answer 1 · 2019-11-14T11:11:35.623

161

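Calling nltk.download() with no arguments opens the interactive downloader for the whole nltk data collection instead of fetching a single resource, so it can take a long time or appear to hang. You can try downloading only the stopwords corpus that you need: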

import nltk
nltk.download('stopwords')
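
To avoid downloading again on every run, one option is to check for the resource first and fetch it only when it is missing. The following is a minimal sketch and not part of the original answer: ensure_nltk_resource is a hypothetical helper name, and its find and download parameters exist only so the logic can be tried without network access. By default they fall back to nltk.data.find and nltk.download.

```python
def ensure_nltk_resource(resource_path, package, find=None, download=None):
    """Download `package` only if `resource_path` is not installed yet.

    Returns True if a download was triggered, False otherwise.
    `find` and `download` are hypothetical injection points; when omitted
    they default to nltk.data.find and nltk.download.
    """
    if find is None or download is None:
        import nltk  # imported lazily so stand-ins can be used without nltk
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)   # raises LookupError when the resource is missing
        return False          # already installed, nothing to do
    except LookupError:
        download(package)     # fetch just this one package
        return True


# Typical call before using the corpus:
#   ensure_nltk_resource('corpora/stopwords', 'stopwords')
```

After this call succeeds, nltk.corpus.stopwords.words('english') should no longer raise the LookupError from the question.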

Or from command line (thanks to Rafael Valero's answer):

python -m nltk.downloader stopwords

Reference:

Installing NLTK Data - Command line installation

edited Nov 14 '19 at 11:11

answered Jan 13 '17 at 18:06

Kurt Bourbaki

Rafael Valero · Answer 2 · 2018-03-21T09:02:41.590

46

The same as mentioned here by Kurt Bourbaki, but from the command line:

python -m nltk.downloader stopwords

edited Mar 21 '18 at 09:02

answered Mar 01 '18 at 11:35

Rafael Valero

score 14 · Answer 3 · edited Aug 22 '18 at 07:36

You can run this separately in a console.
It will report the result of the download.

import nltk
nltk.download('stopwords')

I used the Jupyter console when I faced this problem.

edited Aug 22 '18 at 07:36

L_J

answered Aug 22 '18 at 06:14

Umesh

6

How is this answer different from the accepted answer? – EliadL Jan 27 '19 at 09:36

score 11 · Answer 4 · answered Oct 26 '20 at 12:18

If you get an SSL/certificate error, run the following code.

Note that this works by disabling SSL certificate verification!

import nltk
import ssl

try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    pass
else:
    ssl._create_default_https_context = _create_unverified_https_context

nltk.download()
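
Because the snippet above switches off certificate verification for the whole process, a narrower variant is to disable it only while the download runs and restore the default afterwards. This is a minimal sketch, assuming the same private ssl hooks used above are available in your Python version; unverified_https is a hypothetical helper name.

```python
import ssl
from contextlib import contextmanager


@contextmanager
def unverified_https():
    """Temporarily disable HTTPS certificate verification, then restore it."""
    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        # Always put the original context factory back, even on errors.
        ssl._create_default_https_context = original


# Intended use (requires nltk and network access):
#   import nltk
#   with unverified_https():
#       nltk.download('stopwords')
```

Skipping verification is still insecure, so this is best treated as a temporary workaround until the local certificate store is fixed.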

score 5 · Answer 5 · edited Jun 21 '18 at 19:51

If your PC connects to the internet through a proxy, try this:

import nltk

nltk.set_proxy('http://proxy.example.com:3128', ('USERNAME', 'PASSWORD'))
nltk.download('stopwords')

edited Jun 21 '18 at 19:51

koPytok

answered Jun 21 '18 at 17:49

R Kumar

score 3 · Answer 6 · answered Nov 26 '21 at 06:28

Use a GPU runtime; it will not give you any error.

The same code that you are using will work:

import nltk
stopwords = nltk.corpus.stopwords.words('english')
print(stopwords[:10])

answered Nov 26 '21 at 06:28

deevas

score 1 · Answer 7 · edited Dec 22 '20 at 08:03

I know the comment is quite late, but if it helps:

Although nltk.download('stopwords') will do the job, there might be times when it does not work because of proxy issues, for example if your organization has blocked the download.

I found this GitHub link pretty handy: as a workaround, I can just pick up the list of words from there and integrate it manually into my project.
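
If you go the manual route, the copied word list can be loaded without nltk at all. This is a minimal sketch, assuming you saved the list as a plain-text file with one word per line; the function names and the file name in the comment are hypothetical.

```python
import io


def load_stopwords(path):
    """Read one stopword per line, skipping blank lines."""
    with io.open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def remove_stopwords(tokens, stopwords):
    """Keep only the tokens that are not in the stopword set."""
    return [token for token in tokens if token.lower() not in stopwords]


# Intended use, with a hypothetical file name:
#   stopwords = load_stopwords('english_stopwords.txt')
#   remove_stopwords(['The', 'cat', 'and', 'dog'], stopwords)
```

A set is used so that each membership check stays cheap even for long documents.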

edited Dec 22 '20 at 08:03

MattAllegro

answered Dec 22 '20 at 07:39

factorThis

score 0 · Answer 8 · answered Oct 29 '19 at 19:52

showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml

If you are running this command in a Jupyter notebook, it opens another window titled 'NLTK Downloader'. In that window, you can select the packages you want and then click the Download button to start downloading.

The Jupyter cell keeps running until you close the NLTK Downloader window.

score 0 · Answer 9 · answered Mar 13 '22 at 07:04

Check what error you are getting:

python3 -m nltk.downloader stopwords

Error :

RuntimeWarning: 'nltk.downloader' found in sys.modules after import of package 'nltk', but prior to execution of 'nltk.downloader'; this may result in unpredictable behaviour


warn(RuntimeWarning(msg))
[nltk_data] Error loading stopwords: <urlopen error [SSL:
[nltk_data]     CERTIFICATE_VERIFY_FAILED] certificate verify failed:
[nltk_data]     unable to get local issuer certificate (_ssl.c:1123)>

Use the solution provided by @reshma2k

score 0 · Answer 10 · answered Jul 13 '22 at 02:04

Install nltk and download the stopwords:

!pip3 install nltk
import nltk
nltk.download('stopwords')

answered Jul 13 '22 at 02:04

Karthikeyan VK

score 0 · Answer 11 · answered May 12 '23 at 08:29

In my case, after running

import nltk
nltk.download('stopwords')

it did not work. The issue was that the downloaded archive (wordnet.zip in my case) was not unzipped automatically. Simply go to the folder where the command python3 -m textblob.download_corpora installed the data and unzip the archive yourself:

cd ~
cd nltk_data/corpora/
unzip stopwords.zip
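
The same manual step can be done from Python when no unzip tool is available. This is a minimal sketch using only the standard library; unzip_corpus is a hypothetical helper name, and the path in the comment assumes the default ~/nltk_data location used by the commands above.

```python
import os
import zipfile


def unzip_corpus(corpora_dir, name):
    """Extract <corpora_dir>/<name>.zip next to the archive.

    Returns the path of the extracted corpus folder.
    """
    archive = os.path.join(corpora_dir, name + ".zip")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(corpora_dir)
    return os.path.join(corpora_dir, name)


# Intended use, assuming the default data directory:
#   corpora = os.path.join(os.path.expanduser('~'), 'nltk_data', 'corpora')
#   unzip_corpus(corpora, 'stopwords')
```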

score -1 · Answer 12 · answered Jul 23 '23 at 18:16

import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
words=stopwords.words('english')[0:20]
print(words)

answered Jul 23 '23 at 18:16

VIDYA RENUKA

2

Answer needs supporting information Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](https://stackoverflow.com/help/how-to-answer). – moken Jul 24 '23 at 08:31
