Is there any algorithm or other way you can think of to determine the word that is least important to the meaning of a sentence? More generally, is there any way to assign a number to each word based on its importance in a sentence? By "importance" I mean that if you were to remove the word from the sentence, it would have little effect on the meaning (low importance) or a large effect on the meaning (high importance).
- What is your definition of *importance* here? Are you trying to say that if you were to remove the word, the sentence could still be understood? – idjaw Nov 24 '16 at 19:09
- @idjaw I edited the question to make it clearer. – Dylan Siegler Nov 24 '16 at 19:14
1 Answer
This is a very vague question. From what I understand, you want to do something like keyword extraction.
POS tagging is a good start. It labels each word in a sentence with its part of speech (noun, verb, adjective, etc.); see POS tagging in NLTK. You can then write your own rules to extract just the parts of speech that interest you.
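As a rough sketch of what "your own rules" could look like, the snippet below maps part-of-speech tags to importance weights. It assumes the sentence has already been tagged with Penn Treebank style tags (the format `nltk.pos_tag` returns); the tagged sentence here is written by hand, and the weights, `TAG_WEIGHTS`, `score_by_pos` and `least_important` are all made up for illustration, not something NLTK provides.

```python
# Hypothetical weights keyed by the first two letters of a Penn Treebank tag.
# These numbers are arbitrary assumptions; tune them for your own data.
TAG_WEIGHTS = {
    "NN": 1.0,  # nouns (NN, NNS, NNP, ...)
    "VB": 0.8,  # verbs (VB, VBD, VBZ, ...)
    "JJ": 0.6,  # adjectives
    "RB": 0.4,  # adverbs
}
DEFAULT_WEIGHT = 0.1  # determiners, prepositions, and everything else


def score_by_pos(tagged):
    """Return (word, weight) pairs for a list of (word, tag) pairs."""
    scored = []
    for word, tag in tagged:
        # Use the two-letter prefix so NNS, NNP, VBD, ... share one weight.
        weight = TAG_WEIGHTS.get(tag[:2], DEFAULT_WEIGHT)
        scored.append((word, weight))
    return scored


def least_important(tagged):
    """Return the first word that has the lowest weight."""
    return min(score_by_pos(tagged), key=lambda pair: pair[1])[0]


# Hand-written tags in the style a POS tagger would produce.
tagged_sentence = [
    ("The", "DT"), ("quick", "JJ"), ("fox", "NN"),
    ("jumped", "VBD"), ("over", "IN"), ("the", "DT"), ("fence", "NN"),
]

for word, weight in score_by_pos(tagged_sentence):
    print(word, weight)
print("least important:", least_important(tagged_sentence))
```

This only ranks words by grammatical category, so ties are common (every determiner gets the same score); it is a starting point rather than a real measure of meaning.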
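Stopword removal is another option: very common function words (such as "the" or "of") can usually be dropped with little effect on the meaning.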
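A minimal sketch of stopword filtering follows. The stopword set is a tiny hand-picked list used only for illustration (in practice you would likely load a fuller list, for example from NLTK's stopwords corpus), and `split_by_stopwords` is a hypothetical helper name.

```python
# A tiny hand-picked stopword list, for illustration only.
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "to", "of",
    "in", "on", "and", "or", "it",
}


def split_by_stopwords(sentence):
    """Split a sentence into (content_words, removable_words)."""
    content, removable = [], []
    for token in sentence.split():
        # Strip surrounding punctuation and lowercase before the lookup.
        word = token.strip(".,!?;:").lower()
        if not word:
            continue
        if word in STOPWORDS:
            removable.append(word)
        else:
            content.append(word)
    return content, removable


content, removable = split_by_stopwords("The cat is sleeping on the sofa.")
print("keep:", content)
print("low importance:", removable)
```

Words that land in the second list are candidates for "least important"; the approach is binary, so it cannot rank the remaining content words against each other.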
Keyword extraction covers several related techniques, which you can read about with examples:
- chunking
- chinking
- named entity recognition
- building CFGs and parse trees
- relation extraction
I think reading this chapter will give you the perspective and the code snippets to get started.


Vivek Kalyanarangan