
I am a new coder at a startup, and I am implementing search over documents in a directory on a web host.

I am comparing Lucene/Solr, Whoosh, Sphinx and Xapian. Whoosh is pure Python, but I want your opinions on it too. Which of these has:

  • a mature Python interface that is easy to install and use? (Whoosh is a no-brainer)
  • the least chance of crashes, bottlenecks and other failures?
  • the best-documented interface? (I'm not going to read PHP docs because the Python docs are sparse)
  • the easiest path to getting up and running? (only one has a quick-start tutorial)
Jesvin Jose

3 Answers


Use Whoosh if you don't need the speed or extra features of the alternatives. It's great: it has a nice API and good documentation. My second choice would probably be Xapian, which is fast and has a fairly decent API. They are all fairly mature products. If you don't know what you really need, I'd just go with Whoosh for now.

Zach Kelling

If you want quick Python integration, try IndexTank. You can be up and running in 2 minutes, and it's free.

For the other alternatives, I'd go with Solr (provided you want to host the search servers yourself, or sign up for Websolr).

Disclaimer: I work at IndexTank.

dbuthay
  • IndexTank is now open source: http://engineering.linkedin.com/open-source/indextank-now-open-source – javanna Dec 29 '11 at 11:58

Speaking for Apache Solr, Python has several Solr clients, which I've collected based on feedback from our customers at Websolr:

  1. Haystack is very popular, and designed for seamless integration within Django apps. If you're developing a Django app, Haystack is for you.
  2. Sunburnt looks to be more generic than Haystack, and is also very well documented. If you're doing plain ol' Python, Sunburnt is worth a look.

Other Python Solr clients that I've found, which seem a bit lower level...

Some more details about how your app is built (in particular, is it a Django app?) would help narrow things down from here. Good luck finding the best fit for your app!

Nick Zadrozny
  • I decided to do it the generic XML/HTTP way. It's not a Django app. It's a backend task that would have been too tedious to do in PHP. – Jesvin Jose Jul 27 '11 at 05:47
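
For readers wondering what the "generic XML/HTTP way" might look like, here is a minimal sketch using only the Python standard library. It is not the asker's actual code. It assumes a self-hosted Solr that accepts XML update messages at an update URL you supply, and the field names (`id`, `text`) and the helper names `build_add_message`, `post_xml` and `index_documents` are made up for illustration; your schema and endpoint will likely differ.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Serialize a list of {field_name: value} dicts as a Solr XML <add> message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)  # ElementTree escapes <, > and & on output
    return ET.tostring(add, encoding="unicode").encode("utf-8")


def post_xml(url, payload, timeout=10):
    """POST an XML payload (bytes) and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def index_documents(update_url, docs):
    """Send the documents, then a commit so they become searchable.

    update_url is an assumption: the XML update endpoint of your own Solr.
    """
    post_xml(update_url, build_add_message(docs))
    post_xml(update_url, b"<commit/>")
```

With a Solr instance running, a call such as `index_documents("http://localhost:8983/solr/update", [{"id": "docs/report.txt", "text": "Quarterly report"}])` would send one document and commit it; the host, port and path there are placeholders, not values taken from this thread. Building the XML with `xml.etree.ElementTree` rather than string formatting keeps special characters in document text properly escaped.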