
I want to extract keywords from text. The articles can be on any topic, such as music, sports, or agriculture, so I need a way to extract keywords from an arbitrary paragraph. I want to do this in Java. I have searched a lot but could not find a good algorithm or procedure for it.

While searching, I found that there are keyword extraction algorithms available in Python, but I need to do this in Java. What I have done so far:

1) Divided the paragraph into sentences.
2) Removed stop-words.
3) Calculated the word frequencies for each sentence.
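
Roughly, the three steps above could be sketched as follows. This is only a minimal illustration, assuming English text, a tiny hard-coded stop-word list, and the JDK's `java.text.BreakIterator` for sentence splitting. The names `KeywordSketch`, `splitSentences`, `wordFrequencies`, and `STOP_WORDS` are hypothetical and exist only for this example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class KeywordSketch {
    // Assumption: a tiny illustrative stop-word list; a real list would be much longer.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "are", "is", "in", "of", "the", "to"));

    // Step 1: split a paragraph into sentences with the JDK's BreakIterator.
    static List<String> splitSentences(String paragraph) {
        List<String> sentences = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(paragraph);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = paragraph.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    // Steps 2 and 3: drop stop-words, then count the remaining words of one sentence.
    static Map<String, Integer> wordFrequencies(String sentence) {
        Map<String, Integer> counts = new HashMap<>();
        // Lower-case and split on anything that is not a letter a-z.
        for (String token : sentence.toLowerCase(Locale.ENGLISH).split("[^a-z]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String paragraph = "The guitar is loud. The guitar and the drums are music.";
        for (String sentence : splitSentences(paragraph)) {
            System.out.println(sentence + " -> " + wordFrequencies(sentence));
        }
    }
}
```

The counts here are per sentence, which matches step 3 as described. Counting over the whole paragraph instead would only require passing the full text to `wordFrequencies`.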

The problem is that we can't assume the sentence with the highest word frequency is the main sentence. I'm also planning to build a summarizer that extracts the main sentences from a paragraph. Right now I'm totally stuck. Can anyone help? Any help will be appreciated.

chopss
  • Might want to take a look at [this](http://stackoverflow.com/questions/17447045/java-library-for-keywords-extraction-from-input-text) – James H Jul 04 '14 at 06:17
  • How do you define "keyword"? – Nikhil Talreja Jul 04 '14 at 06:27
  • Extracting the main topic words. I can't define it because it depends on each paragraph/text – chopss Jul 04 '14 at 06:41
  • You must have a definition if you'd like to search it with an algorithm. Btw there are tons of text processing libs out there, I'm sure you'll find an existing algorithm that suits your needs. – rlegendi Jul 04 '14 at 09:58

0 Answers