How to POS tag a sentence using Python

Question

Possible Duplicate:
Failed loading english.pickle with nltk.data.load

I ran into this problem when trying to do POS tagging, even though I have already imported everything that is required. I'm not sure why the output is not printed. Can anyone help me point out what is wrong with my code?

>>> import nltk
>>> import nltk.corpus
>>> from nltk.corpus import brown
>>> from nltk.corpus import treebank
>>> import nltk.tag
>>> from nltk import tokenize
>>> from nltk import word_tokenize
>>> from nltk import pos_tag
>>> text=nltk.word_tokenize("Historians have scant knowledge about Borneo's early history, a certain fact though is the presence of modern man in Sarawak some 40,000 years ago (discovery of a Homo Sapiens skull at the Niah Caves), but most of today's indigenous populations belong to the same Austronesian groups, brought by maritime migratory waves in the last 5,000 or so years, who have settled along the Malayan peninsula, the Indonesian, Philippine, Micronesian and Polynesian archipelagos, and as far as Madagascar to the west and Easter Island to the east.")
>>> nltk.pos_tag(text)

Error:

Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "C:\Python27\lib\site-packages\nltk\tag\__init__.py", line 99, in pos_tag
        tagger = load(_POS_TAGGER)
    File "C:\Python27\lib\site-packages\nltk\data.py", line 605, in load
        resource_val = pickle.load(_open(resource_url))
    File "C:\Python27\lib\site-packages\nltk\data.py", line 686, in _open
        return find(path).open()
    File "C:\Python27\lib\site-packages\nltk\data.py", line 467, in find
        raise LookupError(resource_not_found)
LookupError:
**********************************************************************
    Resource 'taggers/maxent_treebank_pos_tagger/english.pickle' not
    found.  Please use the NLTK Downloader to obtain the resource:
    >>> nltk.download()
    Searched in:
        - 'C:\\Users\\user/nltk_data'
        - 'C:\\nltk_data'
        - 'D:\\nltk_data'
        - 'E:\\nltk_data'
        - 'C:\\Python27\\nltk_data'
        - 'C:\\Python27\\lib\\nltk_data'
        - 'C:\\Users\\user\\AppData\\Roaming\\nltk_data'
**********************************************************************

score 4 · Answer 1 · answered Oct 14 '12 at 06:55

Like the error says, you need to use the NLTK Downloader to download the resource taggers/maxent_treebank_pos_tagger/english.pickle.

You can do this by running import nltk; nltk.download() from a Python shell. The file you need is under the Models tab, named maxent_treebank_pos_tagger.
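If you would rather not go through the interactive downloader window, the same check can be scripted. The following is only a minimal sketch: `ensure_resource` is a hypothetical helper name, not part of NLTK, and it assumes that `nltk.data.find` raises `LookupError` for a missing resource (as the traceback in the question shows) and that `nltk.download` accepts a package identifier.

```python
def ensure_resource(resource_path, package_id, find=None, download=None):
    """Download package_id only if resource_path is not installed yet.

    Hypothetical helper, not part of NLTK. `find` and `download` default to
    nltk.data.find and nltk.download; they are parameters only so the logic
    can be tried with stand-in functions.
    Returns True if a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk  # imported lazily, only when the real NLTK calls are needed
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        # Looks through the same nltk_data directories listed in the error
        find(resource_path)
        return False
    except LookupError:
        # Resource is missing: fetch the package that contains it
        download(package_id)
        return True
```

Calling `ensure_resource("taggers/maxent_treebank_pos_tagger/english.pickle", "maxent_treebank_pos_tagger")` before `nltk.pos_tag(text)` should then fetch the model on the first run and do nothing afterwards. The resource name here is taken from the error message in the question; if your NLTK version reports a different resource, use the name from your own error message instead.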
