I want to write a regex that matches a string only if it starts with IJ followed by lowercase letters, including non-ASCII (Unicode) ones.

private static String pattern = "^\\u0049\\u004A(\\p{Ll})*";
System.out.println(Pattern.compile(pattern).matcher("IJP").find()); // true

I am using this regex but it doesn't seem to work. For "IJP" it should not match as P is uppercase.

user1298426
All you missed is `$`. Use anchors to make sure you match the entire string regardless of the regex method you are using (`find()` or `matches()`). – Wiktor Stribiżew Sep 03 '20 at 20:14

2 Answers


Your pattern should be:

final String pattern = "^\\u0049\\u004A\\p{Ll}*$";

Note the $ at the end: it anchors the match to the end of the input, so only zero or more lowercase letters may appear between IJ and the end of the string. I have also removed the unnecessary group around \p{Ll}.

Code Demo:

jshell> String pattern = "^\\u0049\\u004A\\p{Ll}*$";
pattern ==> "^\\u0049\\u004A\\p{Ll}*$"

jshell> Pattern.compile(pattern).matcher("IJP").find();
$6 ==> false
anubhava

\\u0049 is a rather obscure way of writing I, don't you think? Why not just... write pattern = "^IJ\\p{Ll}*"?

IJP does match though. Think about it. You're asking for an I, then a J, then 0 or more lowercase letters. Which is right there: An I, a J, and 0 lowercase letters.
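
To see this concretely, here is a minimal sketch that prints the substring find() actually matched. The class name FindPrefixDemo and the helper firstMatch are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindPrefixDemo {
    // Hypothetical helper: returns the substring find() matched, or null if there was no match.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String regex = "^IJ\\p{Ll}*"; // no $ at the end

        // find() is satisfied by the prefix: I, J, and zero lowercase letters.
        System.out.println(firstMatch(regex, "IJP"));    // IJ

        // The lowercase run is consumed, then the match simply stops before P.
        System.out.println(firstMatch(regex, "IJabcP")); // IJabc
    }
}
```

The uppercase P is never rejected; it is just left outside the matched region.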

Either use matches() instead of find() (matches() asks whether the ENTIRE string matches the regexp, while find() asks whether some substring does) or, as you've already thrown the ^ in there, toss a $ at the end as well.
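
A rough side-by-side of the two options, as a sketch rather than a definitive implementation. AnchorDemo, viaFind and viaMatches are hypothetical names chosen for this example; the non-ASCII test input uses é (U+00E9) and ß (U+00DF), both of which are in the Ll category.

```java
import java.util.regex.Pattern;

public class AnchorDemo {
    // Hypothetical wrappers around the two Matcher methods being compared.
    static boolean viaFind(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    static boolean viaMatches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String open = "^IJ\\p{Ll}*";    // original pattern, no end anchor
        String closed = "^IJ\\p{Ll}*$"; // same pattern with $ added

        System.out.println(viaFind(open, "IJP"));      // true  (prefix "IJ" is enough)
        System.out.println(viaMatches(open, "IJP"));   // false (whole string must match)
        System.out.println(viaFind(closed, "IJP"));    // false ($ forbids the trailing P)

        System.out.println(viaMatches(open, "IJabc"));           // true
        System.out.println(viaFind(closed, "IJ\u00e9\u00df"));   // true (non-ASCII lowercase)
    }
}
```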

rzwitserloot