We have a problem that at the moment we are not allowed to use ElasticSearch, so we need to implement a search function with MySQL. One desired feature is a prefixed, tokenized search, so a sentence like
"The quick brown fox jumped over the lazy dog" could be findable when you search for "jump". I think I would need to define a rule like (pseudocode):
(*)(beginning OR whitespace)(prefix)(*)
I assume it is possible to do that with JPA (Criteria API)? But what if we have two terms? All of them have to be combined by AND, e.g. the above rule should result in TRUE for both terms in at least one column. That means "jump fox" would result in a hit, but "jump rabbit" would not. Is that also possible with Criteria API?
Or do you know a better solution than Criteria API? I heard Hibernate can do LIKE queries more elegantly (with less code) but unfortunately we use EclipseLink.
Based on the answer below here is my full solution. It's all in one method to keep it simple here ("simple JPA criteria API" is an oxymoron though). If anyone wants to use it, consider some refactoring
public List<Customer> findMatching(String searchPhrase) {
List<String> searchTokens = TextService.splitPhraseIntoNonEmptyTokens(searchPhrase);
if (searchTokens.size() < 1 || searchTokens.size() > 5) { // early out and denial of service attack prevention
return new ArrayList<>();
}
CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
CriteriaQuery<Customer> criteriaQuery = criteriaBuilder.createQuery(Customer.class);
Root<Customer> rootEntity = criteriaQuery.from(Customer.class);
Predicate[] orClausesArr = new Predicate[searchTokens.size()];
for (int i = 0; i < searchTokens.size() ; i++) {
// same normalization methods are used to create the indexed searchable data
String assumingKeyword = TextService.normalizeKeyword(searchTokens.get(i));
String assumingText = TextService.normalizeText(searchTokens.get(i));
String assumingPhoneNumber = TextService.normalizePhoneNumber(searchTokens.get(i));
String assumingKeywordInFirstToken = assumingKeyword + '%';
String assumingTextInFirstToken = assumingText + '%';
String assumingPhoneInFirstToken = assumingPhoneNumber + '%';
String assumingTextInConsecutiveToken = "% " + assumingText + '%';
Predicate query = criteriaBuilder.or(
criteriaBuilder.like(rootEntity.get("normalizedCustomerNumber"), assumingKeywordInFirstToken),
criteriaBuilder.like(rootEntity.get("normalizedPhone"), assumingPhoneInFirstToken),
criteriaBuilder.like(rootEntity.get("normalizedFullName"), assumingTextInFirstToken),
// looking for a prefix after a whitespace:
criteriaBuilder.like(rootEntity.get("normalizedFullName"), assumingTextInConsecutiveToken)
);
orClausesArr[i] = query;
}
criteriaQuery = criteriaQuery
.select(rootEntity) // you can also select only the display columns and ignore the normalized/search columns
.where(criteriaBuilder.and(orClausesArr))
.orderBy(
criteriaBuilder.desc(rootEntity.get("customerUpdated")),
criteriaBuilder.desc(rootEntity.get("customerCreated"))
);
try {
return entityManager
.createQuery(criteriaQuery)
.setMaxResults(50)
.getResultList();
} catch (NoResultException nre) {
return new ArrayList<>();
}
}