4

How to map À, Á, Â, Ã, Ä, Å to A for more efficient search?

I am writing an Android application that needs to search a set of strings in which some characters carry accents like these.

To make the search more forgiving, I would like to map À, Á, Â, Ã, Ä, Å to just A. For example, if the user's query is "Test", the following strings should all match it: Tȅst, Tȇst, Teśt, etc.

Is there a way to do this on Android with API level >= 8?

Pongpat
  • 13,248
  • 9
  • 38
  • 51

2 Answers

3

Lucene does this kind of thing. Take a look at the org.apache.lucene.analysis.icu.ICUNormalizer2Filter for an approach to text normalization for search.
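
If pulling in Lucene is too heavy, the same normalization idea can be sketched with the JDK's java.text.Normalizer: decompose each character into a base letter plus combining marks (NFD), then drop the marks. This is a minimal sketch, not what ICUNormalizer2Filter does internally. The class and method names (AccentFolding, fold, matches) are made up for illustration, and it assumes java.text.Normalizer is available, which on Android is, to my knowledge, API level 9 and up rather than 8.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not part of Lucene or the Android SDK.
final class AccentFolding {
    private AccentFolding() {}

    /** Returns a lowercase copy of the input with combining marks removed. */
    static String fold(String input) {
        // NFD splits a precomposed letter into its base letter followed by combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches any Unicode mark, which covers the combining accents produced above.
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    /** Accent-insensitive, case-insensitive substring test. */
    static boolean matches(String candidate, String query) {
        return fold(candidate).contains(fold(query));
    }
}
```

With this, AccentFolding.matches("Teśt", "Test") is true, because both sides fold to "test". Letters that have no canonical decomposition, such as Ø, pass through unchanged and would need an explicit mapping.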

Aurand
  • 5,487
  • 1
  • 25
  • 35
1
String text = "Your SeÅrchable Text";
String searchMe = text.replaceAll("[ÀÁÂÃÄÅ]", "A");

I would just replace all of them in a searchable copy of the main String. Pretty simple! If there are other cases (such as accented 'E' characters), just do another replaceAll:

searchMe = searchMe.replaceAll("[ÈÉÊË]", "E"); // accented variants of E
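
When the list of letters grows, one replaceAll per letter means one pass over the string per letter. A lookup table can do the same replacement in a single pass, and it uses only String and StringBuilder, so it should not depend on any particular API level. The sketch below is one possible way to write it. The names AccentTable, ACCENTED, PLAIN and fold are hypothetical, and the table covers only the A and E variants mentioned here, so it would have to be extended by hand.

```java
// Hypothetical helper illustrating a single-pass lookup table.
final class AccentTable {
    // Position i in ACCENTED maps to position i in PLAIN; the two strings must stay the same length.
    // Escapes are used so the source does not depend on file encoding:
    // U+00C0..U+00C5 are the accented A variants, U+00C8..U+00CB the accented E variants.
    private static final String ACCENTED =
            "\u00C0\u00C1\u00C2\u00C3\u00C4\u00C5" + "\u00C8\u00C9\u00CA\u00CB";
    private static final String PLAIN = "AAAAAA" + "EEEE";

    private AccentTable() {}

    /** Returns a copy of the input with every listed accented letter replaced by its base letter. */
    static String fold(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            int index = ACCENTED.indexOf(c);
            // Characters not in the table are copied through unchanged.
            out.append(index >= 0 ? PLAIN.charAt(index) : c);
        }
        return out.toString();
    }
}
```

For the string above, AccentTable.fold("Your SeÅrchable Text") gives "Your SeArchable Text". Lowercase accented letters are not in this table and would need their own entries.
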
Josh T
  • 564
  • 3
  • 12