The Ruby folks have Ferret. Does anyone know of a similar initiative for Python? We're currently using PyLucene, but I'd like to investigate moving to pure Python searching.
Probably not an answer to the question, but Elasticsearch implements a simple web interface on top of Lucene, and PyES is a Python wrapper over Elasticsearch. I have used PyES comfortably, but some advanced features present in Lucene are still missing from Elasticsearch. – amit kumar Sep 06 '11 at 06:32
By the way, the old Ferret URL redirects now to http://www.chandanweb.com/solutions/web-applications.html - I've replaced the URL with the new github page https://github.com/dbalmain/ferret :) – icedwater Sep 05 '13 at 04:12
For accessing Lucene indices I found (and am trying out) `plush`: https://pypi.python.org/pypi/plush/0.3.0 – icedwater Sep 05 '13 at 04:33
Any reason for going for pure Python? – avi Feb 06 '14 at 05:47
8 Answers
Just used whoosh for a project and it really was easy to use. No messing around at all - just worked. – John Montgomery Jul 17 '09 at 15:49
Unfortunately, whoosh seems to be abandoned now (and has a lot of bad bugs). – OrangeDog Jul 31 '18 at 10:09
Also, updating/deleting documents does not really work. Search still returns those deleted/overwritten documents. – user1050755 Sep 04 '22 at 08:10
I recently found pyndexter. It provides an abstract interface to various backend full-text search engines/indexers, and it ships with a default pure-Python implementation.
These things can be disastrously slow in pure Python, though.
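
To give an idea of what "abstract interface plus a pure-Python default" can look like, here is a rough sketch using only the standard library. It is not pyndexter's actual API: the names `SearchBackend` and `InMemoryBackend` and their methods are made up for illustration, and the default backend is just a small inverted index with AND semantics.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
import re


class SearchBackend(ABC):
    """Hypothetical backend interface (an assumption, not pyndexter's API)."""

    @abstractmethod
    def add(self, doc_id, text):
        """Index a document, replacing any earlier version with the same id."""

    @abstractmethod
    def remove(self, doc_id):
        """Drop a document from the index."""

    @abstractmethod
    def search(self, query):
        """Return the set of ids of documents containing every query term."""


class InMemoryBackend(SearchBackend):
    """Pure-Python default: an inverted index held in dictionaries."""

    _token = re.compile(r"\w+")

    def __init__(self):
        self._postings = defaultdict(set)  # term -> ids of documents containing it
        self._terms = {}                   # doc_id -> terms indexed for that document

    def _tokenize(self, text):
        return set(self._token.findall(text.lower()))

    def add(self, doc_id, text):
        self.remove(doc_id)  # re-adding a document replaces the old version
        terms = self._tokenize(text)
        self._terms[doc_id] = terms
        for term in terms:
            self._postings[term].add(doc_id)

    def remove(self, doc_id):
        # Forget the document and take its id out of every posting set.
        for term in self._terms.pop(doc_id, ()):
            self._postings[term].discard(doc_id)

    def search(self, query):
        terms = self._tokenize(query)
        if not terms:
            return set()
        # .get avoids creating empty posting sets for unknown terms.
        sets = [self._postings.get(term, set()) for term in terms]
        return set.intersection(*sets)


index = InMemoryBackend()
index.add("a", "Ferret is written in C")
index.add("b", "Whoosh is written in pure Python")
print(index.search("written python"))  # only document "b" has both terms
```

A wrapper around a C or Java engine could implement the same three methods, so application code would not need to change when the pure-Python default turns out to be too slow.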

I came here looking for something to access Lucene indices in Python, and I'm not too concerned about speed at this point. I just don't want to be tied to Java. So thanks for the pynter. – icedwater Sep 05 '13 at 04:26
The last release of pyndexter was in 2007, and the link provided here is dead, unfortunately. – webtweakers Nov 15 '16 at 13:39
For some applications pure Python is overrated. Take a look at Xapian.

Thanks for the Xapian mention. Not what I need right now, but I'll sure keep it in mind for later. – PEZ Jan 13 '09 at 22:55
For non-pure Python, Sphinx Search with its Python API is the fastest option I know of. According to benchmarks on several blogs, Sphinx Search is much faster than Lucene and uses much less memory; it is written in C++.
I am developing a multi-document search engine based on it, using Python and web2py as the framework.

lupy was a Lucene port to pure Python. The lupy people suggest that you use PyLucene instead. Sorry. Maybe you can use the Java sources in combination with Jython.

It's interesting that Ferret seems to be very appreciated and used while Lupy was abandoned. – PEZ Jan 13 '09 at 09:22
Well, PyLucene seems to cater to a similar community. Also, some people are even ready to do their full-text searches in Java because of Lucene ;-) – Yuval F Jan 13 '09 at 09:42
+1 to the Xapian and Pyndexter answers.
Ferret is actually written in C with Ruby bindings on top. A pure Ruby search engine would be even slower than a pure Python one. I would love to see "someone else" write a Cython/Pyrex layer to give Python an interface to Ferret, but I won't do it myself: why bother when there are already Python bindings for Xapian?

Thanks. I used the term "pure" in a dirty way. =) If I can install it with easy_setup or the like, I'm happy. – PEZ Feb 07 '09 at 11:38
After weeks of searching for this, I found a nice Python solution: repoze.catalog. It's not strictly Python-only because it uses ZODB for storage, but it seems a better dependency to me than something like SOLR.

I want Solr in Python. What was the conclusion about software like Solr but written in Python? – tursunWali Feb 22 '21 at 22:05