
I'm working on a kind of search in my app and need to check whether my text contains the words that the user entered.

Currently my code is:

if (textToCheck.toLowerCase().contains(input.toLowerCase())) {

This works perfectly fine, but I would like my search to be smarter and match only on letters, so that if a text contains other symbols (anything other than a-z), the app ignores them.

Example: if this is my text and this is my input, I want a match.

textToCheck = "The search will not work with, comma, brackets, or any other symbol"

input = "work with comma brackets"

james
  • Doesn't the regular expression also remove the spaces, @nhahtdh? – Blackbelt Oct 17 '14 at 10:32
  • Does your text only ever contain unaccented letters? – fge Oct 17 '14 at 10:36
  • Will there be a space both before and after a special character such as ',' in your example? If so, even replaceAll("[^A-Za-z ]*", "") is not going to work. – SMA Oct 17 '14 at 10:36
  • My fault, I updated the example. The text will probably contain only one space, after the symbol. – james Oct 17 '14 at 10:49
  • I fail to understand your input vs desired output here. Are you trying to match text that only contains alphabetic characters? Are you trying to only match alphabetic characters - or words containing only alphabetic characters - in an input? Not very clear to me. – Mena Oct 17 '14 at 11:43

2 Answers


You could use

import java.text.Normalizer;

/**
 * Removes accents by decomposing characters (NFD) and deleting the combining marks.
 *
 * @param s the string to clean
 * @return s without accented characters
 * @see http://stackoverflow.com/questions/15190656/easy-way-to-remove-utf-8-accents-from-a-string
 */
private static String stripAccents(String s) {
    String s1 = Normalizer.normalize(s, Normalizer.Form.NFD);
    s1 = s1.replaceAll("[\\p{InCombiningDiacriticalMarks}]", "");
    return s1;
}

private static String onlyAlphabet(String string) {
    return stripAccents(string)
            // deletes anything that is not a letter or whitespace
            .replaceAll("[^A-Za-z\\s]", "")
            // collapses runs of whitespace into a single space
            .replaceAll("\\s{2,}", " ")
            // converts to lower case
            .toLowerCase();
}

public static boolean find(String textToBeSearched, String input){
    String normalizedWhole = onlyAlphabet(textToBeSearched);
    String normalizedInput = onlyAlphabet(input);
    return normalizedWhole.contains(normalizedInput);
}

If you are not going to get accented characters, you can skip the stripAccents method.

The method onlyAlphabet takes care of spaces by reducing several consecutive spaces to one. That way, if the user writes something like "comma , brackets", it is treated the same as "comma brackets".
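To check the behaviour described above against the example from the question, here is a minimal, self-contained sketch. The class name AlphabetSearchDemo and the third test input are made up for illustration; the normalization steps are the same as in onlyAlphabet, with the accent stripping inlined.

```java
import java.text.Normalizer;

// Hypothetical demo class; only the normalization steps come from the answer.
class AlphabetSearchDemo {

    // Strips accents, drops everything that is not a letter or whitespace,
    // collapses runs of whitespace into one space, and lower-cases the result.
    public static String onlyAlphabet(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}", "")
                .replaceAll("[^A-Za-z\\s]", "")
                .replaceAll("\\s{2,}", " ")
                .toLowerCase();
    }

    // Compares the normalized forms of both strings.
    public static boolean find(String textToBeSearched, String input) {
        return onlyAlphabet(textToBeSearched).contains(onlyAlphabet(input));
    }

    public static void main(String[] args) {
        String textToCheck =
                "The search will not work with, comma, brackets, or any other symbol";
        System.out.println(find(textToCheck, "work with comma brackets")); // true
        System.out.println(find(textToCheck, "comma , brackets"));         // true
        System.out.println(find(textToCheck, "semicolon"));                // false
    }
}
```

Note that both strings have to be normalized. Cleaning only the input would still fail on the commas in the text.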

dtortola
  • It works, but every search now takes about 24 seconds to complete (before, it took only 1.5-2 seconds). P.S. I don't use accented characters. – james Oct 17 '14 at 12:01
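One possible way to reduce the cost, sketched under the assumption that the same text is searched many times: compile the regular expressions once and normalize the text once, so each search only has to normalize the short input. The class NormalizedText and its method names are hypothetical, accent stripping is left out because the asker does not need it, and whether this removes the slowdown reported above depends on how the app calls the search.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper: holds one text in normalized form for repeated searches.
class NormalizedText {
    // Compiled once, instead of on every replaceAll call.
    private static final Pattern NOT_LETTER_OR_SPACE = Pattern.compile("[^A-Za-z\\s]");
    private static final Pattern BLANK_RUN = Pattern.compile("\\s{2,}");

    private final String normalized;

    NormalizedText(String text) {
        // The (possibly long) text is normalized a single time here.
        this.normalized = normalize(text);
    }

    // Same steps as onlyAlphabet, minus accent stripping.
    static String normalize(String s) {
        String lettersOnly = NOT_LETTER_OR_SPACE.matcher(s).replaceAll("");
        return BLANK_RUN.matcher(lettersOnly).replaceAll(" ").toLowerCase(Locale.ROOT);
    }

    // Only the user's input is normalized per search.
    boolean contains(String input) {
        return normalized.contains(normalize(input));
    }

    public static void main(String[] args) {
        NormalizedText text = new NormalizedText(
                "The search will not work with, comma, brackets, or any other symbol");
        System.out.println(text.contains("work with comma brackets")); // true
        System.out.println(text.contains("WORK   with: comma"));       // true
        System.out.println(text.contains("semicolon"));                // false
    }
}
```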

You can achieve this with a regex. If your app should search only on letters, use the following if condition. It removes every character that is not a letter or a space from both the text and the input before comparing them.

if (textToCheck.replaceAll("[^a-zA-Z ]", "").toLowerCase().contains(input.replaceAll("[^a-zA-Z ]", "").toLowerCase())) {
Karthik