I first implemented query search using SimpleQueryString shown as follows.
Entity Definition
@Entity
@Indexed
@AnalyzerDef(name = "whitespace", tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
filters = {
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = ASCIIFoldingFilterFactory.class)
})
public class AdAccount implements SearchableEntity, Serializable {
@Id
@DocumentId
@Column(name = "ID")
@GeneratedValue(strategy = GenerationType.AUTO)
private Long id;
@Field(store = Store.YES, analyzer = @Analyzer(definition = "whitespace"))
@Column(name = "NAME")
private String name;
//other properties and getters/setters
}
I use the white space tokenizer factory here because the default standard analyzer ignores special characters, which is not ideal in my use case. The document I referred to is https://lucene.apache.org/solr/guide/6_6/tokenizers.html#Tokenizers-WhiteSpaceTokenizer. In this document it states that Simple tokenizer that splits the text stream on whitespace and returns sequences of non-whitespace characters as tokens.
SimpleQueryString Method
protected Query inputFilterBuilder() {
SimpleQueryStringMatchingContext simpleQueryStringMatchingContext = queryBuilder.simpleQueryString().onField("name");
return simpleQueryStringMatchingContext
.withAndAsDefaultOperator()
.matching(searchRequest.getQuery() + "*").createQuery();
}
searchRequest.getQuery() returns the search query string, then I append the prefix operator in the end so that it supports prefix query.
However, this does not work as expected with the following example. Say I have an entity whose name is "AT&T Account", when searching with "AT&", it does not return this entity.
I then made the following changes to directly use a white space analyzer. This time searching with "AT&" works as expected. But the search is case sensitive now, i.e, searching with "at&" returns nothing now.
@Field
@Analyzer(impl = WhitespaceAnalyzer.class)
@Column(name = "NAME")
private String name;
My questions are:
Why doesn't it work when I use the white space factory in my first attempt? I assume using the factory versus using the actual analyzer implementation is different?
How to make my search case-insensitive if I use the @Analyzer annotation as in my second attempt?