
I'm using this pattern to check whether a string starts with at least 2 alphabetic characters in front of a colon:

string.matches("^\\p{IsAlphabetic}{2,}:")

but I get the following exception thrown at me:

java.util.regex.PatternSyntaxException: Unknown character property name {Alphabetic} near index 16
    ^\p{IsAlphabetic}{2,}:
    ^
    at java.util.regex.Pattern.error(Pattern.java:1730)
    at java.util.regex.Pattern.charPropertyNodeFor(Pattern.java:2454)
    at java.util.regex.Pattern.family(Pattern.java:2429)
    at java.util.regex.Pattern.sequence(Pattern.java:1848)
    at java.util.regex.Pattern.expr(Pattern.java:1769)
    at java.util.regex.Pattern.compile(Pattern.java:1477)
    at java.util.regex.Pattern.<init>(Pattern.java:1150)
    at java.util.regex.Pattern.compile(Pattern.java:840)
    at java.util.regex.Pattern.matches(Pattern.java:945)
    at java.lang.String.matches(String.java:2102)

even though the specification of the Pattern class states:

Binary properties are specified with the prefix Is, as in IsAlphabetic. The supported binary properties by Pattern are

  • Alphabetic
  • Ideographic
  • Letter
  • ...

and the section Classes for Unicode scripts, blocks, categories and binary properties lists

\p{IsAlphabetic} An alphabetic character (binary property)

Chetan Kinger
freedio
  • It works for me. Note that the `matches` method tries to match the whole string. – Avinash Raj Apr 18 '15 at 14:35
  • It doesn't matter if I enclose `\\p{IsAlphabetic}` in a character class with `[]`. – freedio Apr 18 '15 at 14:36
  • @Avinash: what Java version and OS are you on? – freedio Apr 18 '15 at 14:37
  • What Java version are you using? The pattern works in 1.8. – laune Apr 18 '15 at 14:37
  • Your pattern works correctly in this online tester: http://www.regexplanet.com/advanced/java/index.html. – John Bollinger Apr 18 '15 at 14:39
  • The error message shows "Alphabetic" - did you post code from another location in your program - not the place where the error occurs? – laune Apr 18 '15 at 14:41
  • I'm running Java version 1.8.0_40 (oracle-jdk-bin-1.8.0.40 on 64 bit Gentoo Linux). > Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode) – freedio Apr 18 '15 at 14:43
  • Contrary to @laune's comment on a now-deleted answer, the [Java 7 API docs](http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#ubpc) seem to say that `\p{Alpha}` *is* equivalent to `\p{IsAlphabetic}` when the `UNICODE_CHARACTER_CLASS` flag is specified – John Bollinger Apr 18 '15 at 14:44
  • @laune: I think the character property itself is called Alphabetic, but in order to use it as a binary property, you have to specify the Is…. But in the exception, it uses the character property name again. – freedio Apr 18 '15 at 14:46
  • Is your project configured to work with Java 8? Is it able to compile something like `Predicate<String> empty = String::isEmpty;`? – Pshemo Apr 18 '15 at 14:46
  • @JohnBollinger Try matching String s = "äö:"; using IsAlphabetic and IsAlpha and Alpha. – laune Apr 18 '15 at 14:47
  • @freedio Yup, this is confusing. So some old Java version seems to be the only explanation? – laune Apr 18 '15 at 14:49
  • @all: Thanks a lot for all the comments! `\\p{Alpha}` and `\\p{IsAlpha}` both work, so I'll stick with one of these, since they seem to be equivalent. – freedio Apr 18 '15 at 14:50
  • OK, Java 7 doesn't have IsAlphanumeric yet; it fails with the error as reported. – laune Apr 18 '15 at 14:51
  • @Pshemo: Good point! I'm working in Eclipse, and in fact, my default compiler compliance level is set to 1.6 (the reason being that I'm also developing for legacy Android projects), so the JVM I'm running on is actually irrelevant. But since I've found a viable solution with `\\p{Alpha}`, I won't go through the hassle of reconfiguring and recompiling my 20+ projects. – freedio Apr 18 '15 at 14:56
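
Following up on the remark above that `matches` has to match the whole string: a minimal sketch of a prefix-only check, assuming the goal is to test just the start of a longer input. The class name `PrefixCheck` and the method `hasLetterPrefix` are made up for illustration. The sketch uses `\p{L}` (the Unicode letter category), which is an assumption chosen because older Java versions also accept it; it is not identical to the Alphabetic binary property.

```java
import java.util.regex.Pattern;

public class PrefixCheck {
    // Assumption: \p{L} (general category "Letter") is close enough to "alphabetic" for this check.
    private static final Pattern PREFIX = Pattern.compile("\\p{L}{2,}:");

    /** Returns true if the input begins with at least two letters followed by a colon. */
    public static boolean hasLetterPrefix(String input) {
        // lookingAt() anchors at the start of the input but does not require the whole input to match.
        return PREFIX.matcher(input).lookingAt();
    }

    public static void main(String[] args) {
        String input = "ab: rest of the line";
        // matches() only succeeds if the entire string fits the pattern.
        System.out.println(input.matches("\\p{L}{2,}:"));
        // lookingAt() only inspects the prefix.
        System.out.println(hasLetterPrefix(input));
        System.out.println(hasLetterPrefix("a: too short"));
    }
}
```

With `String.matches`, the trailing text after the colon makes the match fail, whereas `lookingAt()` accepts it.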

1 Answer


This works and returns true using Java 1.8:

String s = "äö:";
System.out.println(s.matches("^\\p{IsAlphabetic}{2,}:"));

Note that the forms available in Java 1.7 (Alpha, IsAlpha) do not necessarily include characters outside US-ASCII. This returns false:

String s = "äö:";
System.out.println(s.matches("^\\p{IsAlpha}{2,}:"));

But note that this works in 1.7 and returns true:

// Needs java.util.regex.Pattern and java.util.regex.Matcher.
String s = "äö:";
Pattern pat = Pattern.compile("^\\p{Alpha}{2,}:", Pattern.UNICODE_CHARACTER_CLASS);
Matcher mat = pat.matcher(s);
System.out.println(mat.matches());
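
The snippets above can be combined into one runnable class for a side-by-side comparison. This is only a sketch: the class name `AlphaFormsDemo` and the helper `matchesWith` are invented for illustration, it assumes a Java 7 or later runtime (the `UNICODE_CHARACTER_CLASS` flag does not exist before that), and the input is written with Unicode escapes so the result does not depend on the source file encoding.

```java
import java.util.regex.Pattern;

public class AlphaFormsDemo {
    /** Compiles the regex with the given flags and tests it against the whole input. */
    public static boolean matchesWith(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String s = "\u00e4\u00f6:"; // "äö:" written with escapes

        // Unicode binary property: covers non-ASCII letters.
        System.out.println(matchesWith("^\\p{IsAlphabetic}{2,}:", 0, s));
        // Is-prefixed POSIX name: reported above to return false for this input.
        System.out.println(matchesWith("^\\p{IsAlpha}{2,}:", 0, s));
        // Plain POSIX class without the flag: US-ASCII letters only.
        System.out.println(matchesWith("^\\p{Alpha}{2,}:", 0, s));
        // Same POSIX class with the flag: follows the Unicode definition.
        System.out.println(matchesWith("^\\p{Alpha}{2,}:", Pattern.UNICODE_CHARACTER_CLASS, s));
    }
}
```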
laune
  • The form with `\p{Alpha}` (but not the one with `\p{IsAlpha}`) *does* work if the `UNICODE_CHARACTER_CLASS` flag is specified for the pattern, as documented. Tested on Java 8 against the input given in this answer. – John Bollinger Apr 18 '15 at 15:02
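
If keeping the one-line `String.matches` call is preferable, the same flag can, as far as the Pattern documentation describes it, also be switched on inside the pattern with the embedded flag expression `(?U)`. A minimal sketch, assuming Java 7 or later; the class name `EmbeddedFlagDemo` and the method `isTwoAlphaThenColon` are hypothetical names for illustration.

```java
public class EmbeddedFlagDemo {
    /** Returns true if the whole input is at least two alphabetic characters followed by a colon. */
    public static boolean isTwoAlphaThenColon(String input) {
        // (?U) enables UNICODE_CHARACTER_CLASS, so \p{Alpha} follows the Unicode definition.
        return input.matches("(?U)\\p{Alpha}{2,}:");
    }

    public static void main(String[] args) {
        System.out.println(isTwoAlphaThenColon("\u00e4\u00f6:")); // "äö:" written with escapes
        System.out.println(isTwoAlphaThenColon("ab:"));
        System.out.println(isTwoAlphaThenColon("a:"));
    }
}
```

The embedded form avoids the explicit `Pattern.compile` call, at the cost of recompiling the pattern on every `matches` invocation.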