Extract product name from English text

Question

I want to extract the names of products being sold from English text.

For example:

"I'm selling my xbox brand new"

"Selling rarely used 27 inch TV"

Should give me "xbox" and "27 inch TV"

The only thing I can think of at the moment is to hardcode in a giant list of important nouns and important adjectives: ['tv', 'fridge', 'xbox', 'laptop', etc]

Is there a better approach?

[NLP](http://en.wikipedia.org/wiki/Natural_language_processing) isn't easy. — NullUserException, Jan 24 '13 at 20:23
Seriously lanzz? Is the point of this site not to ask questions even when you have no clue where to start? Are algorithmic questions against the rules? — Razor Storm, Jan 24 '13 at 21:00
Unfortunately, "no clue where to start" is often also, pretty much by definition, too broad, and a recommendation question. [Quick googling](https://google.com/search?q=text+extract+product-names) should reveal that this is still a research problem. — tripleee, Jul 02 '18 at 08:49
Possible duplicate of [Text mining - extract name of band from unstructured text](https://stackoverflow.com/questions/6670498/text-mining-extract-name-of-band-from-unstructured-text) — tripleee, Jul 02 '18 at 08:51

score 1 · Accepted Answer · answered Jan 24 '13 at 20:27

It looks like NLTK will give you a list of words and their parts of speech. Since you are only interested in nouns (tags starting with NN), this will provide you with them:

>>> from nltk import pos_tag, word_tokenize
>>> pos_tag(word_tokenize("John's big idea isn't all that bad.")) 
[('John', 'NNP'), ("'s", 'POS'), ('big', 'JJ'), ('idea', 'NN'), ('is',
'VBZ'), ("n't", 'RB'), ('all', 'DT'), ('that', 'DT'), ('bad', 'JJ'),
('.', '.')]
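
To go from tagged words to something like "27 inch TV", one option is to group consecutive numbers, adjectives and nouns into a single phrase. The following is only a rough sketch of that idea, not a complete solution: the function name `candidate_phrases` and the choice of tags are my own assumptions, and the tagged input is written by hand for illustration rather than produced by the tagger.

```python
# Penn Treebank tags treated as part of a candidate product phrase.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}
MODIFIER_TAGS = {"JJ", "CD"}  # adjectives and numbers such as "27"


def candidate_phrases(tagged_tokens):
    """Group consecutive modifier/noun tokens into phrases ending in a noun."""
    phrases = []
    run = []  # current run of (word, tag) pairs

    def flush():
        # Drop trailing modifiers so the phrase ends on a noun.
        while run and run[-1][1] not in NOUN_TAGS:
            run.pop()
        if run:
            phrases.append(" ".join(word for word, _ in run))
        run.clear()

    for word, tag in tagged_tokens:
        if tag in NOUN_TAGS or tag in MODIFIER_TAGS:
            run.append((word, tag))
        else:
            flush()  # any other tag ends the current phrase
    flush()
    return phrases


# Hand-written tags for illustration, not real tagger output.
tagged = [("Selling", "VBG"), ("rarely", "RB"), ("used", "VBN"),
          ("27", "CD"), ("inch", "NN"), ("TV", "NN")]
print(candidate_phrases(tagged))  # ['27 inch TV']
```

In practice you would feed it the output of `pos_tag(word_tokenize(text))`. A real tagger may label words differently than you expect (for example, "brand" in the first sentence could come out as a noun), so the resulting phrases are candidates that still need filtering, not final product names.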
