I want to match two strings in Java, e.g.:

text: János

searchExpression: Janos

Since I don't want to replace all special characters, I thought I could just treat the á as a wildcard, so that any character would match at that position. For instance, if I search János with Jxnos, it should match. Of course, the text could contain multiple special characters. Does anyone have an idea how I could achieve this with a pattern matcher, or do I have to compare char by char?

reto
user3172567
  • Please read about a possible approach based on normalizing your string, e.g. http://stackoverflow.com/questions/3322152/is-there-a-way-to-get-rid-of-accents-and-convert-a-whole-string-to-regular-lette - this approach would make it very easy to remove the special chars – reto Jan 22 '15 at 06:45
  • Thanks for that information, I didn't think about using the apache.commons library. The stripAccents method does exactly what I need. If you write your comment as an answer, I can accept it. – user3172567 Jan 22 '15 at 07:14
  • done - glad this worked for you – reto Jan 22 '15 at 08:09

2 Answers


Use the Pattern and Matcher classes with J\\Snos as the regex. \\S matches any single non-whitespace character.

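import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "foo János bar Jxnos";
// \\S stands for exactly one non-whitespace character, so both spellings match
Matcher m = Pattern.compile("J\\Snos").matcher(str);
while (m.find()) {
    System.out.println(m.group());
}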

Output:

János
Jxnos
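
The hard-coded regex above covers a single known position. To handle several special characters, as the question asks, the pattern could be generated from the text itself. The following is only a minimal sketch: the class WildcardMatch and the helper toWildcardPattern are hypothetical names, and it assumes that "special" means any non-ASCII code point and that accented letters are stored in precomposed form (one code point per letter).

```java
import java.util.regex.Pattern;

class WildcardMatch {
    // Hypothetical helper: builds a regex from the text in which every
    // non-ASCII code point becomes "." and everything else is matched literally.
    static Pattern toWildcardPattern(String text) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 128) {
                // Quote ASCII characters so regex metacharacters stay literal.
                regex.append(Pattern.quote(new String(Character.toChars(cp))));
            } else {
                // Assumption: any non-ASCII code point counts as "special".
                regex.append('.');
            }
            i += Character.charCount(cp);
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    public static void main(String[] args) {
        // "J\u00e1nos" is "Janos" with an acute accent on the a.
        Pattern p = toWildcardPattern("J\u00e1nos");
        System.out.println(p.matcher("Janos").matches()); // true
        System.out.println(p.matcher("Jxnos").matches()); // true
        System.out.println(p.matcher("Janis").matches()); // false
    }
}
```

Note that matches() requires the whole search expression to match; for searching inside longer text, find() would be used as in the answer above.
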
Avinash Raj

A possible solution would be to strip the accents with the StringUtils.stripAccents(input) method from Apache Commons Lang:

import org.apache.commons.lang3.StringUtils;

String input = StringUtils.stripAccents("János");
System.out.println(input); // Janos

Make sure to also read up on the more elaborate approaches based on the Normalizer class: Is there a way to get rid of accents and convert a whole string to regular letters?
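
For readers who prefer to avoid the extra dependency, a Normalizer-based variant might look roughly like the sketch below. The class AccentInsensitiveMatch and both helper methods are hypothetical names for illustration. The idea is to decompose the string to NFD and then drop the combining marks; letters without a canonical decomposition (for example ø or ł) are left unchanged by this approach.

```java
import java.text.Normalizer;

class AccentInsensitiveMatch {
    // Hypothetical helper: decompose to NFD, then remove combining marks (\p{M}).
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Hypothetical helper: compares both strings after stripping accents.
    static boolean matchesIgnoringAccents(String text, String searchExpression) {
        return stripAccents(text).equals(stripAccents(searchExpression));
    }

    public static void main(String[] args) {
        // "J\u00e1nos" is "Janos" with an acute accent on the a.
        System.out.println(stripAccents("J\u00e1nos"));                     // Janos
        System.out.println(matchesIgnoringAccents("J\u00e1nos", "Janos"));  // true
        System.out.println(matchesIgnoringAccents("J\u00e1nos", "Jxnos"));  // false
    }
}
```

Unlike the wildcard idea from the question, this only treats a letter and its accented forms as equal, so Jxnos no longer matches János.
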

Community
reto