I understand that you can use Pattern.quote to escape characters within a string that are reserved by regex. But I do not understand why the following does not work:

String s="and this)";
String ps = "\\b("+Pattern.quote(s)+")\\b";
//String pp = Pattern.quote(pat);
Pattern p=Pattern.compile(ps);
Matcher mm = p.matcher("oh and this) is");

System.out.println(mm.find()); //print false, but expecting true?

When `String s = "and this)"` is changed to `String s = "and this"`, i.e., without the `)`, it works. How should I change the code so that it also works as expected with the `)`?

Thanks

Iamat8
Ziqi
  • Possible duplicate of [How to escape text for regular expression in Java](http://stackoverflow.com/questions/60160/how-to-escape-text-for-regular-expression-in-java) – Thomas Weller Oct 01 '15 at 13:55
  • `\b` matches between `\W` and `\w`. It doesn't match between `)` and a space, because both are non-word characters. – Phylogenesis Oct 01 '15 at 13:57
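
To illustrate the comment above, here is a minimal sketch that runs the question's pattern with and without the trailing `)`. The class name `BoundaryDemo` and the helper `found` are made up for this illustration and are not part of the original code.

```java
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "oh and this) is";

        // ")" is followed by a space: two non-word characters, so the
        // trailing \b has no word boundary to match at.
        String withParen = "\\b(" + Pattern.quote("and this)") + ")\\b";
        System.out.println(found(withParen, text));    // false

        // "s" is followed by ")": a word character next to a non-word
        // character, so the trailing \b matches.
        String withoutParen = "\\b(" + Pattern.quote("and this") + ")\\b";
        System.out.println(found(withoutParen, text)); // true
    }
}
```

`Pattern.quote` is doing its job in both cases. The failure comes from the `\b` that follows the quoted text, not from the escaping.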

1 Answer

Use negative lookarounds to assert that no word character comes directly before or after the keyword:

String ps = "(?<!\\w)"+Pattern.quote(s)+"(?!\\w)";

This way you will still match `s` as a whole word, and it won't be a problem if the keyword has non-word characters at the beginning or end.

IDEONE demo:

String s="and this)";
String ps = "(?<!\\w)"+Pattern.quote(s)+"(?!\\w)";
Pattern p=Pattern.compile(ps);
Matcher mm = p.matcher("oh and this) is");
System.out.println(mm.find()); 

Result: true
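
As a rough sketch of how this could be wrapped for reuse, the snippet below puts the same pattern into a helper and tries a few edge cases. The names `WholeWordDemo` and `wholeWord` are hypothetical and chosen only for this example.

```java
import java.util.regex.Pattern;

public class WholeWordDemo {
    // Hypothetical helper: matches `keyword` literally, provided it is not
    // directly preceded or followed by a word character.
    static Pattern wholeWord(String keyword) {
        return Pattern.compile("(?<!\\w)" + Pattern.quote(keyword) + "(?!\\w)");
    }

    public static void main(String[] args) {
        Pattern p = wholeWord("and this)");
        System.out.println(p.matcher("oh and this) is").find()); // true
        System.out.println(p.matcher("and this)").find());       // true: string edges are accepted
        System.out.println(p.matcher("band this) is").find());   // false: preceded by 'b'
        System.out.println(p.matcher("oh and this)x").find());   // false: followed by 'x'
    }
}
```

Because the lookarounds are negative, they also succeed at the very start and end of the input, where there is no neighbouring character at all.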

Wiktor Stribiżew
  • `\W` is the negative of `\w`. You can just use that over the negative lookarounds. – Phylogenesis Oct 01 '15 at 14:03
  • @anubhava: Thank you and merry coming holidays! – Wiktor Stribiżew Oct 01 '15 at 14:29
  • Since @Phylogenesis suggested positive lookarounds with `\W`, I should point out that your current regex with negative lookarounds is the better approach, [as you can see in this demo](https://regex101.com/r/fR1pY9/1). With positive lookarounds it would otherwise have to be [`(?<=\W|^)and this\)(?=\W|$)`](https://regex101.com/r/fR1pY9/2). – anubhava Oct 01 '15 at 14:35
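
The difference discussed in these comments shows up when the keyword sits at the very start or end of the input. The sketch below compares the three variants on such an input. The class name `LookaroundCompare` and the helper `found` are invented for this illustration, and the regex101 demos linked above may use different test strings.

```java
import java.util.regex.Pattern;

public class LookaroundCompare {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String kw = Pattern.quote("and this)");
        String text = "and this)"; // keyword spans the whole input

        // Positive lookarounds with \W need an actual non-word character
        // on both sides, so they fail at the string edges.
        System.out.println(found("(?<=\\W)" + kw + "(?=\\W)", text));     // false

        // Negative lookarounds only require that no word character is adjacent.
        System.out.println(found("(?<!\\w)" + kw + "(?!\\w)", text));     // true

        // Positive lookarounds work once the string edges are added as alternatives.
        System.out.println(found("(?<=\\W|^)" + kw + "(?=\\W|$)", text)); // true
    }
}
```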