What is the best way to achieve Lucene real-time indexing?
-
Real-time indexing of what? Could you explain a bit more about what you are looking for? – Mikael Svenson Jun 18 '10 at 06:24
-
Having spent 2 years working on this off and on in Solr, I have to say: Solr is just not the right platform when it comes to real-time indexing. Commits are very heavy on large indexes -- I've heard of large installs that can barely manage 1 commit per hour. That being said, a recent project called Lucandra may prove promising -- I can't find any docs now, but I thought I heard something about there being no need to commit. http://github.com/tjake/Lucandra#readme – Frank Farmer Jun 18 '10 at 06:28
-
You have to explain more -- "real-time" has no specific meaning: at minimum you need to specify what delay would be acceptable. Soft real-time (say, updates taking 5-10 seconds to show up) is not hard for non-huge indexes, for example. – StaxMan Jul 23 '10 at 23:59
-
Please don't write such confusing comments (about Solr). The question is about Lucene. Solr's brokenness here (closing the IndexWriter on commit, not using Lucene's NRT feature) is off-topic. – Robert Muir Nov 16 '11 at 11:38
5 Answers
Lucene has a feature called near-real-time search to address exactly this need.
It requires that your IndexReader is in the same JVM as your IndexWriter.
You make changes with the IndexWriter, and then open a reader directly from the writer using IndexReader.open(writer), or on older Lucene releases writer.getReader(). This call will normally be very fast (in proportion to how many changes you've made since last opening a reader) as it bypasses the costly commit normally required for opening a reader. It's able to search the un-committed changes in the writer.
This reader still searches a point-in-time snapshot from the writer, i.e. all changes as of when you opened it. To see later changes you have to open a new reader.
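As a rough illustration of the above, here is a minimal sketch assuming a recent Lucene (8.x or later) on the classpath, where the reader is opened with DirectoryReader.open(writer) and refreshed with DirectoryReader.openIfChanged(reader, writer). The class name NrtSearchSketch, the field name "body", and the helper methods are made up for this example; nothing is committed before the searches run.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical demo class; names are illustrative only.
public class NrtSearchSketch {

    // Build a one-field document (the field name "body" is an assumption).
    static Document doc(String text) {
        Document d = new Document();
        d.add(new TextField("body", text, Field.Store.NO));
        return d;
    }

    // Returns hit counts: {first reader, same reader after another add, reopened reader}.
    public static int[] run() throws IOException {
        Query query = new TermQuery(new Term("body", "lucene"));
        try (Directory dir = new ByteBuffersDirectory();
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {

            writer.addDocument(doc("lucene"));

            // Near-real-time reader: opened from the writer, no commit() needed.
            DirectoryReader reader = DirectoryReader.open(writer);
            try {
                int first = new IndexSearcher(reader).count(query);

                // Add another document; the already-open reader is a snapshot
                // and does not see it.
                writer.addDocument(doc("lucene"));
                int stale = new IndexSearcher(reader).count(query);

                // Reopen against the writer; returns null if nothing changed.
                DirectoryReader newer = DirectoryReader.openIfChanged(reader, writer);
                if (newer != null) {
                    reader.close();
                    reader = newer;
                }
                int fresh = new IndexSearcher(reader).count(query);

                return new int[] {first, stale, fresh};
            } finally {
                reader.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int[] counts = run();
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
    }
}
```

In a long-running application you would typically reopen on a timer or after a batch of updates rather than after every document, since each reopen still has a cost proportional to the changes made since the previous one.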

Obtain an index reader from the index writer.
Update: It looks like the current method is to open a DirectoryReader from the IndexWriter object, using DirectoryReader.open.

-
By allowing searches to find documents prior to a commit point. The reader obtained from the writer is continually updated as documents are added. – Adrian Conlon Jun 19 '10 at 07:42
-
When I try IndexReader reader = indexWriter.getReader(); I get the error: The method getReader() from the type IndexWriter is not visible. http://lucene.apache.org/core/4_7_2/core/org/apache/lucene/index/IndexWriter.html?is-external=true – Arun Chandrasekaran May 26 '14 at 11:18
-
Try Zoie

-
Zoie has promise; however, I have found the documentation and code samples severely lacking. Further, with Zoie, indexing is fast if the documents to be indexed are batched up, but indexing a few documents at a time is actually extremely slow. I have personally found raw Lucene to be an easier API to use. – Biju Kunjummen Jun 25 '11 at 13:10
The Lucene wiki has some information: http://wiki.apache.org/lucene-java/NearRealtimeSearch
