-1

I need an open-source RSS crawler and feed reader (or two separate tools) for a project, in Java if possible. I've seen many different tools. Do you know which one is best?

Thanks in advance

Desnoxav
  • Doesn't this question help you: [RSS Feed parser library in Java](http://stackoverflow.com/questions/3020820/rss-feed-parser-library-in-java)? Or this: [Java RSS library](http://stackoverflow.com/questions/113063/java-rss-library)? – Baz Sep 13 '12 at 10:24
  • It's for a university project. We have to use an open-source RSS crawler to understand how crawling works. I've seen Heritrix, which looks fine, but I'm not sure. – Desnoxav Sep 13 '12 at 10:26
  • http://www.vogella.com/tutorials/RSSFeed/article.html – Petronella Sep 10 '18 at 12:51

1 Answer

2

If you want a complete search engine, look at Apache Nutch.

If you just want to understand the principles of web crawling, read the fairly simple introduction in "Programming Collective Intelligence" and the more advanced one in "Introduction to Information Retrieval".
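To give a feel for the core idea before you open those books, here is a minimal sketch of a breadth-first crawl loop: a frontier queue of URLs still to visit plus a set of URLs already seen. It is not taken from either book. The class name `CrawlSketch`, the `crawl` method, and the in-memory `links` map (which stands in for "fetch the page and extract its links") are all made up for illustration, and it assumes Java 9 or later.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CrawlSketch {
    // Breadth-first crawl starting from a seed URL.
    // `links` is a hypothetical stand-in for downloading a page and
    // extracting its outgoing links; a real crawler would do HTTP here.
    static List<String> crawl(String seed, Map<String, List<String>> links, int maxPages) {
        Deque<String> frontier = new ArrayDeque<>(); // URLs waiting to be visited
        Set<String> seen = new HashSet<>();          // URLs already queued or visited
        List<String> visited = new ArrayList<>();    // visit order, for inspection

        frontier.add(seed);
        seen.add(seed);
        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.poll();
            visited.add(url);
            for (String next : links.getOrDefault(url, List.of())) {
                // Set.add returns false for duplicates, so each URL is queued once.
                if (seen.add(next)) {
                    frontier.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // A tiny made-up link graph with a cycle (b links back to a).
        Map<String, List<String>> web = Map.of(
                "a", List.of("b", "c"),
                "b", List.of("c", "a"),
                "c", List.of("d"));
        System.out.println(crawl("a", web, 10)); // prints [a, b, c, d]
    }
}
```

Real crawlers such as Nutch or Heritrix add politeness delays, robots.txt handling, URL normalization, and persistent storage on top of this loop.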

If you need to parse RSS and Atom feeds, use Rome.
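To make concrete what a feed parser does for you, here is a rough sketch that pulls item titles out of an RSS 2.0 document. Note that it deliberately uses the JDK's built-in DOM parser rather than Rome, so it runs without any dependency. The class `RssTitleSketch` and the method `itemTitles` are hypothetical names, and the sample feed is invented. It ignores Atom, namespaces, dates, and encodings, which is exactly the kind of detail a library like Rome takes care of.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class RssTitleSketch {
    // Returns the <title> text of every <item> in an RSS 2.0 document.
    static List<String> itemTitles(String rssXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
            NodeList items = doc.getElementsByTagName("item");
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                // Look for the title inside this item only, so the channel title is skipped.
                NodeList title = item.getElementsByTagName("title");
                if (title.getLength() > 0) {
                    titles.add(title.item(0).getTextContent().trim());
                }
            }
            return titles;
        } catch (Exception e) {
            // Malformed XML or parser setup failure.
            throw new IllegalArgumentException("Could not parse feed", e);
        }
    }

    public static void main(String[] args) {
        // Invented sample feed, for illustration only.
        String feed = "<rss version=\"2.0\"><channel><title>Demo</title>"
                + "<item><title>First</title></item>"
                + "<item><title>Second</title></item>"
                + "</channel></rss>";
        System.out.println(itemTitles(feed)); // prints [First, Second]
    }
}
```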

Also look at a scraper, for example Web-Harvest.

stemm