Questions tagged [pos-tagger]

A part-of-speech tagger, or POS tagger, is a concrete implementation of algorithms that associate discrete terms, as well as hidden parts of speech, with a set of descriptive tags, for example identifying words as nouns, verbs, adjectives, adverbs, and so on. It is often based on machine learning (ML) techniques.

In corpus linguistics, part-of-speech tagging (POS tagging or POST), also called grammatical tagging or word-category disambiguation, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context, i.e. its relationship with adjacent and related words in a phrase, sentence, or paragraph. A simplified form of this is commonly taught to school-age children as the identification of words as nouns, verbs, adjectives, adverbs, etc.

Once performed by hand, POS tagging is now done in the context of computational linguistics, using algorithms that associate discrete terms, as well as hidden parts of speech, with a set of descriptive tags. POS-tagging algorithms fall into two distinct groups: rule-based and stochastic. E. Brill's tagger, one of the first and most widely used English POS taggers, employs rule-based algorithms.
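To make the rule-based idea concrete, here is a minimal sketch in plain Python. It is only in the spirit of a transformation-based tagger: a baseline pass assigns each word its most likely tag, and then contextual rules patch up mistakes. The lexicon, the single rule, and the function names are all hypothetical and hand-written for illustration; a real tagger such as Brill's learns its rules from an annotated corpus.

```python
# Toy lexicon (hypothetical data): each word maps to one "most likely" tag.
LEXICON = {
    "the": "DT",
    "dog": "NN",
    "wants": "VBZ",
    "to": "TO",
    "walk": "NN",  # treated as a noun by default in this toy lexicon
}

# Hand-written contextual rules (hypothetical):
# (tag to change, new tag, tag the previous token must have).
RULES = [("NN", "VB", "TO")]


def baseline_tag(tokens):
    """Assign each token its lexicon tag, defaulting to NN for unknown words."""
    return [(tok, LEXICON.get(tok.lower(), "NN")) for tok in tokens]


def apply_rules(tagged):
    """Rewrite tags wherever a rule's left-context condition is met."""
    result = list(tagged)
    for from_tag, to_tag, prev_tag in RULES:
        for i in range(1, len(result)):
            word, tag = result[i]
            if tag == from_tag and result[i - 1][1] == prev_tag:
                result[i] = (word, to_tag)
    return result


def tag(tokens):
    """Baseline tagging followed by rule-based correction."""
    return apply_rules(baseline_tag(tokens))


print(tag("The dog wants to walk".split()))
```

Here the baseline pass tags "walk" as a noun, and the rule retags it as a verb because it follows "to". A stochastic tagger would instead choose the tag sequence with the highest probability under a model estimated from data.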

586 questions
52
votes
5 answers

What Is the Difference Between POS Tagging and Shallow Parsing?

I'm currently taking a Natural Language Processing course at my university and am still confused by some basic concepts. I got the definition of POS tagging from the Foundations of Statistical Natural Language Processing book: Tagging is the task of…
bertzzie
  • 3,558
  • 5
  • 30
  • 41
36
votes
3 answers

Python NLTK pos_tag not returning the correct part-of-speech tag

Having this: text = word_tokenize("The quick brown fox jumps over the lazy dog") And running: nltk.pos_tag(text) I get: [('The', 'DT'), ('quick', 'NN'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN'), ('the', 'DT'), ('lazy',…
faceoff
  • 901
  • 3
  • 11
  • 16
33
votes
4 answers

What does NN VBD IN DT NNS RB mean in NLTK?

When I chunk text, I get lots of codes in the output like NN, VBD, IN, DT, NNS, RB. Is there a list documented somewhere which tells me the meaning of these? I have tried googling nltk chunk code, nltk chunk grammar, nltk chunk tokens. But I am not…
Knows Not Much
  • 30,395
  • 60
  • 197
  • 373
32
votes
7 answers

What is NLTK POS tagger asking me to download?

I just started using a part-of-speech tagger, and I am facing many problems. I started POS tagging with the following: import nltk text=nltk.word_tokenize("We are going out.Just you and me.") When I want to print 'text', the following…
Pearl
  • 759
  • 1
  • 6
  • 7
18
votes
3 answers

How to use OpenNLP with Java?

I want to POS-tag an English sentence and do some processing. I would like to use OpenNLP. I have it installed. When I execute the command I:\Workshop\Programming\nlp\opennlp-tools-1.5.0-bin\opennlp-tools-1.5.0>java -jar opennlp-tools-1.5.0.jar…
shababhsiddique
  • 904
  • 3
  • 14
  • 40
16
votes
6 answers

spaCy token.tag_ full list

The official documentation of token.tag_ in spaCy is as follows: A fine-grained, more detailed tag that represents the word-class and some basic morphological information for the token. These tags are primarily designed to be good features for…
Daniel
  • 1,783
  • 2
  • 15
  • 25
15
votes
3 answers

How to apply pos_tag_sents() to pandas dataframe efficiently

In situations where you wish to POS tag a column of text stored in a pandas dataframe with 1 sentence per row, the majority of implementations on SO use the apply method dfData['POSTags']= dfData['SourceText'].apply( lambda row:…
mobcdi
  • 1,532
  • 2
  • 28
  • 49
15
votes
6 answers

Existing API for NLP in C++?

Is/are there existing C++ NLP API(s) out there? The closest thing I have found is CLucene, a port of Lucene. However, it seems a bit obsolete and the documentation is far from complete. Ideally, this/these API(s) would permit tokenization, stemming…
merours
  • 4,076
  • 7
  • 37
  • 69
14
votes
2 answers

NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH environment variable

I am working on a project that requires me to tag tokens using nltk and python. So I wanted to use this. But I came up against a few problems. I went through a lot of other already asked questions and other forums but I was still unable to get a solution…
Spoorthi Marakkini
  • 141
  • 1
  • 1
  • 3
14
votes
3 answers

Multilingual NLTK for POS Tagging and Lemmatizer

I recently started working with NLP and tried to use NLTK and TextBlob for analyzing texts. I would like to develop an app that analyzes reviews made by travelers, so I have to manage a lot of texts written in different languages. I need to do two…
Alessio Schiavelli
  • 161
  • 1
  • 1
  • 6
13
votes
4 answers

Extracting nationalities and countries from text

I want to extract all country and nationality mentions from text using nltk, I used POS tagging to extract all GPE labeled tokens but the results were not satisfying. abstract="Thyroid-associated orbitopathy (TO) is an autoimmune-mediated orbital…
user6453258
  • 191
  • 1
  • 1
  • 8
12
votes
2 answers

NLP for extracting actions from text

I'm hoping somebody can point me in the right direction to learn about separating out actions from a bunch of text. Suppose I have this text Drop off the dry cleaning, and go to the corner store and pick-up a jug of milk and get a pint of…
pedalpete
  • 21,076
  • 45
  • 128
  • 239
11
votes
4 answers

Stanford POS tagger in Java usage

Mar 9, 2011 1:22:06 PM edu.stanford.nlp.process.PTBLexer next WARNING: Untokenizable: � (U+FFFD, decimal: 65533) Mar 9, 2011 1:22:06 PM edu.stanford.nlp.process.PTBLexer next WARNING: Untokenizable: � (U+FFFD, decimal: 65533) Mar 9, 2011 1:22:06 PM…
KNsiva
  • 377
  • 2
  • 8
  • 19
11
votes
2 answers

How to check a word if it is adjective or verb using python nltk?

I have a list of words like amazing, interesting, love, great, nice, and I want to check whether each word is an adjective or a verb; for example, "love" is a verb and "nice" is an adjective. How can I do this using Python or NLTK? Any help?
nizam uddin
  • 341
  • 2
  • 6
  • 15
11
votes
1 answer

Error using Stanford POS Tagger in NLTK Python

I am trying to use Stanford POS Tagger in NLTK but I am not able to run the example code given here http://www.nltk.org/api/nltk.tag.html#module-nltk.tag.stanford import nltk from nltk.tag.stanford import POSTagger st =…
B-Abbasi
  • 813
  • 2
  • 17
  • 38