
I'm trying to understand how the Lucene query syntax works, so I wrote this small program. When I use a NumericRangeQuery I can find the documents I want, but when I parse the equivalent search condition with QueryParser I get no hits, although the conditions are the same. I understand the difference could be caused by the analyzer, but I'm using StandardAnalyzer, which does not remove numeric values.

Can someone tell me what I'm doing wrong? Thanks.

package org.burre.lucene.matching;

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.*;
import org.apache.lucene.util.Version;

public class SmallestEngine {
  private static final Version VERSION=Version.LUCENE_48;
  private StandardAnalyzer analyzer = new StandardAnalyzer(VERSION);
  private Directory index = new RAMDirectory();

  private Document buildDoc(String name, int beds) {
    Document doc = new Document();
    doc.add(new StringField("name", name, Field.Store.YES));
    doc.add(new IntField("beds", beds, Field.Store.YES));
    return doc;
  }

  public void buildSearchEngine() throws IOException {
    IndexWriterConfig config = new IndexWriterConfig(VERSION,
            analyzer);

    IndexWriter w = new IndexWriter(index, config);
    // Generate 10 houses with 0 to 3 beds
    for (int i=0;i<10;i++)
        w.addDocument(buildDoc("house"+(100+i),i % 4));
    w.close();
  }
  /**
   * Execute the query and show the result
   */
  public void search(Query q) throws IOException {
    System.out.println("executing query\""+q+"\"");
    IndexReader reader = DirectoryReader.open(index);
    try {
        IndexSearcher searcher = new IndexSearcher(reader);
        ScoreDoc[] hits = searcher.search(q, 10).scoreDocs;
        System.out.println("Found " + hits.length + " hits.");
        for (int i = 0; i < hits.length; ++i) {
            int docId = hits[i].doc;
            Document d = searcher.doc(docId);
            System.out.println(""+(i+1)+". " + d.get("name") + ", beds:"
                    + d.get("beds"));
        }
    } finally {
        if (reader != null)
            reader.close();
    }
  }

  public static void main(String[] args) throws IOException, ParseException {
    SmallestEngine me = new SmallestEngine();
    me.buildSearchEngine();
    System.out.println("SearchByRange");
    me.search(NumericRangeQuery.newIntRange("beds", 3, 3,true,true));
    System.out.println("-----------------");
    System.out.println("SearchName");
    me.search(new QueryParser(VERSION,"name",me.analyzer).parse("house107"));
    System.out.println("-----------------");
    System.out.println("Search3Beds");
    me.search(new QueryParser(VERSION,"beds",me.analyzer).parse("3"));
    System.out.println("-----------------");
    System.out.println("Search3BedsInRange");
    me.search(new QueryParser(VERSION,"name",me.analyzer).parse("beds:[3 TO 3]"));
  }
}

The output of this program is:

SearchByRange
executing query"beds:[3 TO 3]"
Found 2 hits.
1. house103, beds:3
2. house107, beds:3
-----------------
SearchName
executing query"name:house107"
Found 1 hits.
1. house107, beds:3
-----------------
Search3Beds
executing query"beds:3"
Found 0 hits.
-----------------
Search3BedsInRange
executing query"beds:[3 TO 3]"
Found 0 hits.
Conffusion

3 Answers


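You need to use NumericRangeQuery to search a numeric field. QueryParser builds text-based term and range queries, while an IntField is indexed in a numeric encoding rather than as the text term "3", so the parsed queries never match.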

The answer here could give you some insight.

Also the answer here says

for numeric values (longs, dates, floats, etc.) you need to have NumericRangeQuery. Otherwise Lucene has no idea how do you want to define similarity.

manal

What you need to do is to write your own QueryParser:

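public class CustomQueryParser extends QueryParser {

    // Constructor for the Lucene 4.8 signature QueryParser(Version, String, Analyzer)
    public CustomQueryParser(Version matchVersion, String defaultField, Analyzer analyzer) {
        super(matchVersion, defaultField, analyzer);
    }

    @Override
    public Query newTermQuery(Term term) {
        if (term.field().equals("beds")) {
            // exact match on a numeric field: a range whose two bounds are the same value
            int value = Integer.parseInt(term.text());
            return NumericRangeQuery.newIntRange(term.field(), value, value, true, true);
        }
        return super.newTermQuery(term);
    }

    @Override
    public Query newRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive) {
        if (field.equals("beds")) {
            // numeric range; assumes both bounds are present and are valid ints
            return NumericRangeQuery.newIntRange(field, Integer.parseInt(part1),
                    Integer.parseInt(part2), startInclusive, endInclusive);
        }
        return super.newRangeQuery(field, part1, part2, startInclusive, endInclusive);
    }
}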
mindas
  • A little bit disappointed that Lucene is not able to interpret a numeric condition. Your solution helped me the most. My implementation works for every numeric field (not only for beds:) if (StringUtils.isNumeric(term.text())) { return NumericRangeQuery.newIntRange(field, Integer.parseInt(part1),Integer.parseInt(part2),part1Inclusive,part2Inclusive); } – Conffusion May 22 '14 at 15:49
  • You expect magic from Lucene, remember that Lucene is a library and not a standalone product. The feature you want would make sense for Solr or Elasticsearch. Anyway, what this class does is to say "if the field name is X, then construct numeric query". Moreover, it allows you to plug into `QueryParser` mechanism seamlessly: you only need to provide field name and don't have to parse the query yourself. I think that's not too much. – mindas May 22 '14 at 15:56
  • p.s. if you liked the answer, you might want to [accept it](http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work). This is how this site works. Thank you! – mindas May 22 '14 at 15:57
  • Accepted your answer, but please look at my post at the bottom. It describes a more generic solution for all numeric fields. – Conffusion May 22 '14 at 16:08

It seems you always have to use NumericRangeQuery for numeric conditions (thanks to mindas), so, as he suggested, I created my own, more intelligent QueryParser. Using the Apache commons-lang function StringUtils.isNumeric(), I can make it generic:

public class IntelligentQueryParser extends QueryParser {
    // take over super constructors

    @Override
    protected org.apache.lucene.search.Query newRangeQuery(String field,
            String part1, String part2, boolean part1Inclusive, boolean part2Inclusive) {
        if (StringUtils.isNumeric(part1)) {
            return NumericRangeQuery.newIntRange(field, Integer.parseInt(part1),
                    Integer.parseInt(part2), part1Inclusive, part2Inclusive);
        }
        return super.newRangeQuery(field, part1, part2, part1Inclusive, part2Inclusive);
    }

    @Override
    protected org.apache.lucene.search.Query newTermQuery(
            org.apache.lucene.index.Term term) {
        if (StringUtils.isNumeric(term.text())) {
            return NumericRangeQuery.newIntRange(term.field(), Integer.parseInt(term.text()),
                    Integer.parseInt(term.text()), true, true);
        }
        return super.newTermQuery(term);
    }
}

Just wanted to share this.
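
One caveat with the parser above: StringUtils.isNumeric() only checks that every character is a digit. It therefore rejects negative values such as -3, and it accepts digit strings that are too long for an int, which then make Integer.parseInt() throw a NumberFormatException. The range method also checks only part1, so a non-numeric or missing part2 (as far as I know, an open bound such as [3 TO *] may arrive as null) would fail in the same way. A minimal, stdlib-only sketch of a safer check is to attempt the parse and report failure instead of throwing. IntBounds and tryParse are hypothetical names used for illustration, not part of Lucene or commons-lang:

```java
import java.util.OptionalInt;

/** Hypothetical helper (not a Lucene or commons-lang class). */
final class IntBounds {
    private IntBounds() {}

    /**
     * Returns the parsed value, or an empty OptionalInt when the text is
     * null or is not a valid 32-bit integer (non-numeric or out of range).
     */
    static OptionalInt tryParse(String text) {
        if (text == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(text.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("3"));           // OptionalInt[3]
        System.out.println(tryParse("-3"));          // OptionalInt[-3]
        System.out.println(tryParse("99999999999")); // OptionalInt.empty
        System.out.println(tryParse(null));          // OptionalInt.empty
    }
}
```

In newRangeQuery you could then call tryParse on both parts and build the NumericRangeQuery only when both results are present, deferring to super.newRangeQuery(...) otherwise. Keying the decision on the field name, as mindas suggests, would additionally avoid turning digit-only terms on text fields into numeric queries.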

Conffusion