Can I integrate the Apache Nutch crawler with the Solr index server?

Edit:

One of our devs came up with a solution based on these posts:

  1. Running Nutch and Solr
  2. Update for Running Nutch and Solr

Answer

Yes

Scott Cowan

3 Answers

If you're willing to upgrade to Nutch 1.0, you can use its solrindex command, as described in this article by Lucid Imagination: http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/.
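
As a rough sketch of what that step looks like, the snippet below only assembles the solrindex command line; it does not run Nutch. The argument order (Solr URL, crawldb, linkdb, then one or more segments) is my assumption based on the Nutch 1.x usage message, so check the output of `bin/nutch solrindex` on your version. The helper name `build_solrindex_command`, the Solr URL, and the crawl paths are all made up for illustration.

```python
import shlex
from typing import List, Sequence


def build_solrindex_command(
    solr_url: str,
    crawldb: str,
    linkdb: str,
    segments: Sequence[str],
    nutch_bin: str = "bin/nutch",
) -> List[str]:
    """Assemble an argv list for Nutch's solrindex step (hypothetical helper).

    Assumed argument order: <solr url> <crawldb> <linkdb> <segment> ...
    Verify against the usage message of your Nutch release.
    """
    if not segments:
        raise ValueError("at least one segment is required")
    return [nutch_bin, "solrindex", solr_url, crawldb, linkdb, *segments]


if __name__ == "__main__":
    # Example values only; replace with your own Solr URL and crawl directory.
    cmd = build_solrindex_command(
        "http://localhost:8983/solr",
        "crawl/crawldb",
        "crawl/linkdb",
        ["crawl/segments/20090309120000"],
    )
    # Print the command so it can be inspected before running it by hand.
    print(shlex.join(cmd))
```

Once the command looks right, you could pass the list to `subprocess.run(cmd, check=True)` from the Nutch install directory, with Solr already running and its schema set up as the article describes.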

bdaniels

Nutch 2.x is designed to use Solr by default. You can follow the steps in http://wiki.apache.org/nutch/Nutch2Tutorial, or the more detailed instructions in the book "Web Crawling and Data Mining with Apache Nutch".

jinhong_lu

It's still an open issue. If you're feeling adventurous, you could try applying those patches yourself, although it doesn't look simple.

Mauricio Scheffer

Yes, I'm preparing a user group talk on Lucene, so I'll test out this setup. I was hoping there was a quick yes/no answer out there. – Scott Cowan Dec 19 '08 at 11:09