
I am using Lucene 6.6.0 and I would like to use the near real-time search feature of Lucene. However, I could not manage to implement it. The way I try to get the feature is as follows:

I initialize an IndexReader instance:

this.reader = DirectoryReader.open(this.directory);

Let's assume some changes have been made to the index via an IndexWriter instance. Then, if I understand correctly, I need a second IndexReader instance to pick up the updates:

this.newReader = DirectoryReader.openIfChanged(this.reader);
if (this.newReader != null) {
    // Update the IndexSearcher with the new IndexReader instance
    this.searcher = new IndexSearcher(this.newReader);
    this.reader.close();
}

The issue here is that the code does not compile because of the following error: The method openIfChanged(DirectoryReader) in the type DirectoryReader is not applicable for the arguments (IndexReader).

How should I update the IndexReader then?

Secondly, if I update the index again, I will need another IndexReader instance, won't I? Would the best way to update the index freely while the program is running be to switch between two IndexReader instances after each update?

Thank you.

Sabir Khan
C. Güzelhan

1 Answer


Try using a SearcherManager instead of an IndexReader: http://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/SearcherManager.html

With a SearcherManager you can call the following methods:

// get an IndexSearcher for searching
IndexSearcher searcher = searcherManager.acquire();

// release IndexSearcher after search
searcherManager.release(searcher);

// refresh so that the next acquired searcher sees newly indexed records; usually called after a commit
searcherManager.maybeRefresh();

I tried to implement this as well, and basically I did this:

  • create an IndexWriter and leave it open
  • create a SearcherManager with the IndexWriter as param.
  • use SearcherManager to search
  • use IndexWriter for indexing operations.
  • commit after indexing
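The steps above can be sketched roughly as follows. This is a minimal illustration rather than a definitive implementation: it assumes lucene-core 6.6.0 on the classpath, and the class name NrtSketch, the field name "body", the in-memory RAMDirectory and the helper countHits are all made up for the example. As far as I understand the Lucene documentation, when the SearcherManager is created from the IndexWriter it is the refresh that makes new documents visible to searches, while commit makes them durable on disk.

```java
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class NrtSketch {

    // Hypothetical helper: count matches with a searcher borrowed from the manager.
    static int countHits(SearcherManager manager, Query query) throws IOException {
        IndexSearcher searcher = manager.acquire();
        try {
            return searcher.count(query);
        } finally {
            // Always release, even if the search throws.
            manager.release(searcher);
        }
    }

    // Index one document and return the hit counts before and after the refresh.
    static int[] run() throws IOException {
        try (Directory directory = new RAMDirectory(); // in-memory index, for illustration only
             IndexWriter writer = new IndexWriter(
                     directory, new IndexWriterConfig(new StandardAnalyzer()))) {
            // Create the manager from the long-lived writer; null means default SearcherFactory.
            SearcherManager manager = new SearcherManager(writer, null);
            try {
                Query query = new TermQuery(new Term("body", "lucene"));

                Document doc = new Document();
                doc.add(new TextField("body", "hello lucene", Field.Store.YES));
                writer.addDocument(doc);

                // The current searcher was opened before the document was added.
                int before = countHits(manager, query);

                // Reopen so that the next acquired searcher sees the new document.
                manager.maybeRefresh();
                int after = countHits(manager, query);

                // Persist the changes to the directory.
                writer.commit();
                return new int[] { before, after };
            } finally {
                manager.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int[] counts = run();
        System.out.println("before refresh: " + counts[0] + ", after refresh: " + counts[1]);
    }
}
```

With this setup the count before the refresh should be 0 and the count after it 1. There is no need to juggle two reader instances by hand: the manager swaps in the new searcher and closes the old one once every borrower has released it.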

Additionally you can use a separate thread to commit periodically and not on every write because the commit operation may be pretty "expensive".
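One way to sketch such a background committer with only the JDK is shown below. PeriodicCommitter and markDirty are hypothetical names, and the Runnable passed in is a stand-in for something like writer.commit() followed by searcherManager.maybeRefresh(), with the IOException wrapped in an unchecked exception. Treat it as an outline under those assumptions, not a tuned implementation.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class PeriodicCommitter implements AutoCloseable {

    // Set by indexing threads after a write; cleared by the commit thread.
    private final AtomicBoolean dirty = new AtomicBoolean(false);
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // commitAction is a placeholder for the real commit and refresh calls.
    public PeriodicCommitter(Runnable commitAction, long period, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            // Skip the tick when nothing was written since the last commit.
            if (dirty.getAndSet(false)) {
                try {
                    commitAction.run();
                } catch (RuntimeException e) {
                    // An uncaught exception would cancel all future runs,
                    // so swallow it here and retry on the next tick.
                    dirty.set(true);
                }
            }
        }, period, period, unit);
    }

    // Call after each indexing operation instead of committing directly.
    public void markDirty() {
        dirty.set(true);
    }

    @Override
    public void close() {
        scheduler.shutdown();
    }
}
```

Indexing code would call markDirty() after each write, so many writes within one period collapse into a single commit.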

Example here: http://www.lucenetutorial.com/lucene-nrt-hello-world.html

dom
  • I did as you said and it seems to work well. When some changes occur, I call this.writer.maybeMerge(); this.writer.commit(); this.searcherManager.maybeRefresh(); These 3 lines are necessary to be able to search the updated index, aren't they? I prefer to refresh the index manually by calling these methods after indexing instead of using a thread, because in my case indexing occurs only at the beginning, so there is no need to refresh it regularly. Using a thread to refresh the index would be the better choice when the index is dynamic, wouldn't it? – C. Güzelhan Jul 25 '17 at 11:57
  • commit and maybeRefresh are necessary, yes. I'm not sure whether maybeMerge is really needed; you would have to check that. Hang on, though: why do you want to implement an NRT use case if you only index once? Don't you have any changes to the index after the initial indexing? – dom Jul 25 '17 at 13:30
  • Technically, no, but I am only indexing documents belonging to certain file types. When I execute my program and new file types can also be indexed, I do not want to construct the index from scratch but to update the existing one with new documents. I also want the ability to update the index even after the initial update in case I may need it later. – C. Güzelhan Jul 25 '17 at 14:36
  • OK, that's fine. In this case an NRT implementation is useful. Regarding your question: the dedicated thread makes sense if you have a lot of changes within a short time span. In that case a commit can block other threads, because they have to wait until the commit is done, so it would make sense to have a dedicated thread that commits, for example, every 5 seconds. – dom Jul 26 '17 at 07:30