Questions tagged [named-entity-extraction]
99 questions
23
votes
3 answers
Training n-gram NER with Stanford NLP
Recently I have been trying to train n-gram entities with Stanford Core NLP. I have followed this tutorial: http://nlp.stanford.edu/software/crf-faq.shtml#b
With this, I am able to specify only unigram tokens and the class each one belongs to.…

Arun A K
- 2,205
- 2
- 27
- 45
20
votes
2 answers
How to use DBPedia to extract Tags/Keywords from content?
I am exploring how I can use Wikipedia's taxonomy information to extract Tags/Keywords from my content.
I found articles about DBPedia. DBpedia is a community effort to extract structured information from Wikipedia and to make this information…

Pritam Raut
- 289
- 2
- 3
- 12
13
votes
6 answers
Extracting webpage information based on a template in Java
Right now I use Jsoup to periodically extract certain information (not all the text) from some third-party webpages. This works fine until the HTML of a webpage changes. Such a change forces a change in the existing Java code, and this is…

vikasing
- 11,562
- 3
- 25
- 25
12
votes
1 answer
What are the entity types for NLTK?
I've been trying to find the full list of entity types of NLTK. I was only able to find the most common ones on this page, but not the full list. Could you please share the full list of named entity types NLTK has?

Furkanicus
- 329
- 2
- 18
10
votes
3 answers
Methods for extracting locations from text?
What are the recommended methods for extracting locations from free text?
What I can think of is to use regex rules like "words ... in location". But are there better approaches than this?
Also I can think of having a lookup hash table with…

Jack Twain
- 6,273
- 15
- 67
- 107
9
votes
1 answer
Difference between named entity recognition and resolution?
What is the difference between named entity recognition and named entity resolution? Would appreciate a practical example.

London guy
- 27,522
- 44
- 121
- 179
8
votes
4 answers
Entity extraction web services
Are there any paid or free named entity recognition web services available?
Basically I'm looking for something where, if I pass a text like:
"John had french fries at Burger King"
it should identify something along the lines of:
Person:…

Gublooo
- 2,550
- 8
- 54
- 91
7
votes
2 answers
How can Stanford CoreNLP Named Entity Recognition capture measurements like 5 inches, 5", 5 in., 5 in
I'm looking to capture measurements using Stanford CoreNLP. (If you can suggest a different extractor, that is fine too.)
For example, I want to find 15kg, 15 kg, 15.0 kg, 15 kilogram, 15 lbs, 15 pounds, etc. But among CoreNLP's extraction rules, I…

Joshua Fox
- 18,704
- 23
- 87
- 147
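
Since the asker is open to a different extractor, here is a minimal regex sketch in Python rather than a CoreNLP rule. The unit list and the names `UNITS`, `MEASUREMENT`, and `find_measurements` are assumptions made for illustration; they cover only the examples quoted in the question and will produce false positives on phrases such as "5 in the morning".

```python
import re

# Hypothetical unit list: only the units that appear in the question's examples.
# Longer alternatives come first so "inches" is not cut short to "in".
UNITS = r'(?:kilograms?|kg|pounds?|lbs?|inch(?:es)?|in\.?|")'

# A number (integer or decimal), optional whitespace, then a unit that is
# not immediately followed by another letter (so "5 index" is rejected).
MEASUREMENT = re.compile(rf"(\d+(?:\.\d+)?)\s*({UNITS})(?![A-Za-z])")


def find_measurements(text):
    """Return (value, unit) pairs in order of appearance."""
    return [(float(m.group(1)), m.group(2)) for m in MEASUREMENT.finditer(text)]
```

Calling `find_measurements("a 15.0 kg box, 5 inches wide")` yields one pair per measurement; normalizing the unit spellings to a canonical form would be a separate step.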
6
votes
1 answer
How to recognize entities in text that is the output of optical character recognition (OCR)?
I am trying to do multi-class classification with textual data. The problem I am facing is that I have unstructured textual data. I'll explain the problem with an example.
Consider this image for example:
I want to extract and classify text information…

Vivek Mehta
- 2,612
- 2
- 18
- 30
6
votes
1 answer
Semi-automatic annotation tool - How to find RDF Triplets
I'm developing a semi-automatic annotation tool for medical texts and I am completely lost in finding the RDF triplets for annotation.
I am currently trying to use an NLP based approach. I have already looked into Stanford NER and OpenNLP and they…

Gavin Spencer
- 71
- 4
5
votes
1 answer
extending NLP entity extraction
We would like to identify neighborhoods and streets in various cities from a simple search. We use not only English but also various Cyrillic-script languages. We need to be able to identify spelling mistakes in location names. When looking at Python…

Dory Zidon
- 10,497
- 2
- 25
- 39
5
votes
2 answers
Extracting Product Attribute/Features from text
I've been assigned a task to extract features/attributes from product descriptions, such as:
Levi Strauss slim fit jeans
Big shopping bag in pink and gold
I need to be able to extract out attributes such as "Jeans" and "slim fit" or "shopping bag" and…

elric
- 63
- 4
5
votes
3 answers
Entity Extraction Library
I’m looking for a library that does text analysis and extracts entities.
The type/classification of an entity is not critical, it’s the identification of something that’s worthwhile that is critical. The entities universe in this case is infinite,…

hi1869695
- 51
- 3
4
votes
4 answers
Fast algorithm to extract thousands of simple patterns out of large amounts of text
I want to be able to efficiently match thousands of regexps against GBs of text, knowing that most of these regexps will be fairly simple, like:
\bBarack\s(Hussein\s)?Obama\b
\b(John|J\.)\sBoehner\b
etc.
My current idea is to try to extract out of…

jp.
- 106
- 5
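
One simple baseline for the question above is to merge the patterns into a single alternation with one named group per pattern, so the text is scanned once. The sketch below is in Python and uses the two patterns quoted in the excerpt; the labels `obama` and `boehner` and the helper `find_entities` are hypothetical. It is not the asker's method, and with thousands of patterns a backtracking regex engine may not scale, so treat it as a starting point for measurement only.

```python
import re

# Patterns copied from the question excerpt; the labels are hypothetical.
PATTERNS = {
    "obama": r"\bBarack\s(Hussein\s)?Obama\b",
    "boehner": r"\b(John|J\.)\sBoehner\b",
}

# Wrap each pattern in a named group and join them with alternation,
# so a single pass over the text tries every pattern.
COMBINED = re.compile(
    "|".join(f"(?P<{label}>{pattern})" for label, pattern in PATTERNS.items())
)


def find_entities(text):
    """Return (label, matched_text) pairs in order of appearance."""
    # m.lastgroup is the name of the outer named group that matched.
    return [(m.lastgroup, m.group(0)) for m in COMBINED.finditer(text)]
```

Each match reports which pattern fired through `m.lastgroup`, which avoids re-testing the matched text against every pattern.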
4
votes
5 answers
How do I do Entity Extraction in Lucene
I'm trying to do Entity Extraction (more like matching) in Lucene. Here is a sample workflow:
Given some text (from a URL) AND a list of people's names, try to extract names of people from the text.
Note:
Names of people are not completely
…

ankimal
- 915
- 3
- 9
- 22