I have a JUnit test for Lucene 5.3.1 that confuses me. I am trying to test a search on the field "code". Every document in the index has this field set to the same value, so the search should return all documents in the index, but I get no results. When I change the "code" field from StringField to TextField, everything works. Here's the test:


public class LuceneTest {
  private static final String CODE_FIELD_NAME = "code";
  private static final String ID_FIELD_NAME = "id";
  private static final String CODE_VALUE = "Address"; //$NON-NLS-1$
  private static final String TAGS_FIELD_NAME = "tags"; //$NON-NLS-1$

  @Test
  public void testCode() {
    Directory directory = new RAMDirectory();
    Analyzer analyzer = new StandardAnalyzer();
    try {
      createDocuments(analyzer, directory);
      IndexReader reader = DirectoryReader.open(directory);
      IndexSearcher searcher = new IndexSearcher(reader);
      List<Document> result = null;
      // code
      result = search("code:" + CODE_VALUE, searcher,
          analyzer);
      Assert.assertEquals(6, result.size());
    } catch (Exception e) {
      e.printStackTrace();
      Assert.fail(e.getMessage());
    }
  }

  private static List<Document> search(String queryString,
      IndexSearcher searcher, Analyzer analyzer) throws ParseException,
      IOException {
    Query q = new QueryParser(TAGS_FIELD_NAME, analyzer).parse(queryString);
    TopDocs docs = searcher.search(q, 10);
    List<Document> res = new ArrayList<Document>();
    for (ScoreDoc d : docs.scoreDocs) {
      Document doc = searcher.doc(d.doc);
      res.add(doc);
    }
    return res;
  }

  /**
   * @param analyzer
   * @param directory
   * @throws CorruptIndexException
   * @throws LockObtainFailedException
   * @throws IOException
   */
  private static void createDocuments(Analyzer analyzer, Directory directory)
      throws CorruptIndexException, LockObtainFailedException, IOException {
    IndexWriterConfig conf = new IndexWriterConfig(analyzer);
    IndexWriter iwriter = new IndexWriter(directory, conf);
    createDocument(1L, "Unter den Linden", "1", "Berlin", iwriter);
    createDocument(2L, "Broadway", "32, 2/20", "New York", iwriter);
    createDocument(3L, "Main road", "16", "New Hampshire", iwriter);
    createDocument(5L, "Moselgasse", "15", "Wien", iwriter);
    iwriter.close();
  }

  private static Document createDocument(Long id, String street,
      String houseNum, String city, IndexWriter iwriter)
      throws CorruptIndexException, IOException {
    Document doc = new Document();
    doc.add(new TextField(TAGS_FIELD_NAME, houseNum, Store.NO));
    doc.add(new TextField(TAGS_FIELD_NAME, street, Store.NO));
    doc.add(new TextField(TAGS_FIELD_NAME, city, Store.NO));
    doc.add(new LongField(ID_FIELD_NAME, id, Store.YES));
    doc.add(new StringField(CODE_FIELD_NAME, CODE_VALUE, Store.NO));
    iwriter.addDocument(doc);
    return doc;
  }
}
george

1 Answer

Please have a look at the question "Solr Text field and String field - different search behaviour". The answer to that question is probably also the answer to yours. In short: string fields (StrField in Solr, StringField in Lucene) are indexed as a single verbatim token and cannot have any tokenization, analysis or filters applied, whereas TextFields can.
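
To make the difference concrete, here is a minimal plain-Java sketch of the two indexing behaviours. It is a toy model, not Lucene code: `FieldIndexingSketch`, `indexAsStringField` and `indexAsTextField` are hypothetical names, and the "analysis" is reduced to splitting on non-alphanumeric characters and lowercasing (the real StandardAnalyzer does more than that).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy model only: these helpers imitate the two indexing behaviours.
// They are NOT Lucene APIs.
final class FieldIndexingSketch {

    // StringField-like: the whole value becomes one term, byte for byte.
    static List<String> indexAsStringField(String value) {
        return Collections.singletonList(value);
    }

    // TextField-like (simplified): split into tokens, then lowercase each one.
    static List<String> indexAsTextField(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(indexAsStringField("Address"));        // [Address]
        System.out.println(indexAsTextField("Unter den Linden")); // [unter, den, linden]
    }
}
```

Under this model the "code" field holds the single term `Address`, with its capital letter intact, so only a query term that is exactly `Address` can match it.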

Daniel Schneiter
  • I don't think that's the problem. I want an EXACT match on the "code" field and I use StringField, but I still get no results in my test. See: `doc.add(new StringField(CODE_FIELD_NAME, CODE_VALUE, Store.NO));` and later the query is `"code:" + CODE_VALUE` – george Jan 07 '16 at 11:51
  • When searching, doesn't `CODE_VALUE` get lowercased and therefore the match fails? What if you change the value to `address`? – Daniel Schneiter Jan 07 '16 at 12:45
  • You're right, I have to get used to the logic. StringField stores the value exactly, but when searching, StandardAnalyzer lowercases the search term and therefore no results are found? – george Jan 07 '16 at 13:13
  • On Stack Overflow, please upvote questions and answers that you find useful and informative. – Daniel Schneiter Jan 08 '16 at 16:29
  • I don't get it: if a string field value like "abCD0123" cannot be found because the analyzer searches for "abcd0123" instead, does that mean string fields cannot be searched unless they're lower case? Or is there another way to store values as-is and find them again exactly? – zakmck May 31 '20 at 11:29
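
On the last comment: the mismatch comes from the query side, not from the stored value, so the usual way out is to keep the analyzer away from the query term for that field. In Lucene this is commonly done either by building a `TermQuery` directly instead of going through `QueryParser`, or by giving the parser a `PerFieldAnalyzerWrapper` that maps the field to `KeywordAnalyzer`. The sketch below is a plain-Java toy model of those two options, not Lucene code; `ExactMatchSketch` and all of its methods are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only: a one-document inverted index plus two ways of querying it.
// None of these methods are Lucene APIs.
final class ExactMatchSketch {
    // field name -> terms indexed for that field
    private final Map<String, Set<String>> index = new HashMap<>();
    // fields whose query text must be passed through untouched
    private final Set<String> keywordFields = new HashSet<>();

    // StringField-like indexing: store the value as a single unmodified term.
    void indexVerbatim(String field, String value) {
        index.computeIfAbsent(field, f -> new HashSet<>()).add(value);
    }

    // Analogous to configuring a keyword-style analyzer for one field.
    void treatAsKeyword(String field) {
        keywordFields.add(field);
    }

    // Analogous to a direct term query: no analysis of the query term at all.
    boolean termLookup(String field, String term) {
        return index.getOrDefault(field, new HashSet<>()).contains(term);
    }

    // Analogous to a parsed query: the term is lowercased first,
    // unless the field has been registered as a keyword field.
    boolean parsedQuery(String field, String text) {
        String term = keywordFields.contains(field)
                ? text
                : text.toLowerCase(Locale.ROOT);
        return termLookup(field, term);
    }

    public static void main(String[] args) {
        ExactMatchSketch sketch = new ExactMatchSketch();
        sketch.indexVerbatim("code", "abCD0123");

        System.out.println(sketch.parsedQuery("code", "abCD0123")); // false
        System.out.println(sketch.termLookup("code", "abCD0123"));  // true

        sketch.treatAsKeyword("code");
        System.out.println(sketch.parsedQuery("code", "abCD0123")); // true
    }
}
```

The first lookup fails for the same reason the test in the question fails: the index holds `abCD0123` but the lowercased query asks for `abcd0123`. Either bypassing analysis for the lookup or exempting the field from analysis restores the exact match without forcing the stored values to lower case.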