0

Is there an algorithm, or any approach you can think of, for determining which word matters least to the meaning of a sentence? More generally, is there a way to assign each word a number based on its importance in the sentence? By "importance" I mean the effect of removing the word: if removing it changes the meaning very little, the word has low importance; if removing it changes the meaning a lot, it has high importance.

Dylan Siegler

1 Answer

2

This is a very vague question. From what I understand, you want to do something like keyword extraction.

POS tagging is a good start. It labels each word in a sentence with its part of speech (noun, verb, adjective, etc.); see POS Tag NLTK. You can then write your own rules to extract just the parts of speech that interest you.
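As a rough illustration of "write your own rules", here is a minimal sketch that turns POS tags into per-word importance scores. The weights, `TAG_WEIGHTS`, `score_by_pos` and `least_important` are all made up for this example and are not part of NLTK; the tags are Penn Treebank style and written by hand so the snippet runs without a tagger.

```python
# Hypothetical weights per Penn Treebank tag prefix; tune these for your data.
TAG_WEIGHTS = {
    "NN": 1.0,  # nouns
    "VB": 0.8,  # verbs
    "JJ": 0.6,  # adjectives
    "RB": 0.4,  # adverbs
}
DEFAULT_WEIGHT = 0.1  # determiners, prepositions, punctuation, etc.


def score_by_pos(tagged_tokens):
    """Return (word, weight) pairs, keyed on the first two letters of each tag."""
    return [(word, TAG_WEIGHTS.get(tag[:2], DEFAULT_WEIGHT))
            for word, tag in tagged_tokens]


def least_important(tagged_tokens):
    """Return the first word that has the lowest weight."""
    return min(score_by_pos(tagged_tokens), key=lambda pair: pair[1])[0]


# Tags written by hand here; in practice they would come from a tagger
# such as nltk.pos_tag(nltk.word_tokenize(sentence)).
tagged = [("The", "DT"), ("quick", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"), ("fences", "NNS")]
scores = score_by_pos(tagged)
```

A fixed weight per tag is a crude proxy for "importance", but it gives you a number per word that you can refine with further rules.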

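Stopword Removal is another option: very common function words such as "the", "of" and "is" usually add little to the meaning, so they are natural candidates for the least important words.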
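A minimal sketch of stopword removal, assuming a small hand-written stopword set and plain whitespace tokenization. `STOPWORDS` and `remove_stopwords` are names invented for this example; a real list, such as the one in NLTK's stopwords corpus, is much longer.

```python
# A tiny illustrative stopword set; real lists (for example
# nltk.corpus.stopwords.words("english")) are much longer.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "on", "and", "it"}


def remove_stopwords(sentence):
    """Split on whitespace and drop tokens found in STOPWORDS (case-insensitive)."""
    return [tok for tok in sentence.split() if tok.lower() not in STOPWORDS]


kept = remove_stopwords("The cat is on the mat")
```

This amounts to a binary importance score: a word is either dropped or kept.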

Keyword Extraction covers several techniques, which you can read about with examples:

  1. chunking

  2. chinking

  3. named entity recognition

  4. Building CFGs and parse trees

  5. Relation Extraction

I think reading this chapter will give you the perspective and the code snippets you need to get started.
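To give a feel for the chunking and chinking items in the list above, here is a plain-Python sketch of the idea: treat the whole tagged sentence as one chunk, then "chink" out verbs and prepositions so that the remaining word groups are left as chunks. `chunk_then_chink` and its `chink_tags` parameter are hypothetical names for this illustration; NLTK expresses the same idea with chunk grammars rather than a function like this.

```python
def chunk_then_chink(tagged_tokens, chink_tags=("VB", "IN")):
    """Treat the sentence as one chunk, then split it at chinked tokens."""
    chunks, current = [], []
    for word, tag in tagged_tokens:
        if tag.startswith(chink_tags):  # this token is chinked out
            if current:
                chunks.append(current)
            current = []
        else:
            current.append(word)
    if current:
        chunks.append(current)
    return chunks


# Hand-tagged sentence (Penn Treebank style tags).
tagged = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"), ("dog", "NN"),
          ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]
chunks = chunk_then_chink(tagged)
```

The surviving chunks are rough noun phrases, which tend to hold the words that matter most to the meaning.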

Vivek Kalyanarangan