
Lucene is a popular text indexing tool (http://lucene.apache.org/), but installing it for use from Python is a lot of work (see: Building PyLucene on Ubuntu 14.04 (Trusty Tahr)).

Whoosh is a Python-based indexing library (https://pythonhosted.org/Whoosh/quickstart.html) that supports full-text search similar to Lucene's.

  • For the sake of users who don't want the hassle of installing PyLucene just to use an index I've built: is there a way to port a Lucene index into Whoosh? If so, how?

  • Other than Whoosh, I could put the Lucene index into MongoDB's GridFS and then try to replicate full-text search in MongoDB as closely as possible. But how do I port a Lucene index into MongoDB? Is that even possible?

  • Manually, one can use Luke (https://code.google.com/p/luke/) or Clue (https://github.com/javasoze/clue) to read the index files, but is there any other way to export a Lucene index into a format readable from Python (without using PyLucene)?

alvas
  • What exactly is your situation? Do you have an already-indexed Lucene index of data that you no longer have access to? Your best bet is to ignore Lucene entirely and just reindex the data with the technology of your choice. – itsadok Dec 07 '14 at 07:12
  • I don't have the raw data, but I have the indexed data that contains the raw data. – alvas Dec 07 '14 at 09:04
  • Have you tried using Clue to export the data to text files? – itsadok Dec 07 '14 at 11:53
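
The reindexing route suggested in the comments might look roughly like the sketch below. It assumes the stored fields have already been exported to a JSON-lines text file (one document per line); that file layout, the field names `id` and `content`, and the helper names `load_exported_docs` and `reindex_with_whoosh` are all hypothetical, and whether Clue or Luke can produce such a file directly is not confirmed here. It also only recovers fields that were stored in the Lucene index, not fields that were indexed without being stored.

```python
import json
from pathlib import Path


def load_exported_docs(path):
    """Yield one dict per non-empty line of a JSON-lines export (assumed format)."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def reindex_with_whoosh(docs, index_dir):
    """Build a fresh Whoosh index from the exported documents."""
    # Imported lazily so the loader above works even without Whoosh installed.
    from whoosh.fields import ID, TEXT, Schema
    from whoosh.index import create_in

    # Assumed schema: a unique id plus one stored full-text field.
    schema = Schema(id=ID(stored=True, unique=True), content=TEXT(stored=True))

    # create_in expects the target directory to exist already.
    Path(index_dir).mkdir(parents=True, exist_ok=True)
    index = create_in(str(index_dir), schema)

    writer = index.writer()
    for doc in docs:
        writer.add_document(id=str(doc["id"]), content=doc["content"])
    writer.commit()
    return index
```

With an export in hand, `reindex_with_whoosh(load_exported_docs("export.jsonl"), "whoosh_index")` would produce a directory that users can search with Whoosh alone, following the quickstart linked in the question, without needing PyLucene.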

0 Answers