3

This question builds on a previous question.

I am using \P{M}\p{M}* to match all letters (both German and French).

I chose this regex to avoid enumerating Unicode ranges by hand, as in: ^[a-zA-Z[\\u00c0-\\u01ff]]+[\\']?(([-]?[a-zA-Z[\\u00c0-\\u01ff]]*[\\s]?)|([\\s]?[a-zA-Z[\\u00c0-\\u01ff]]*[-]?)){1,2}[a-zA-Z[\\u00c0-\\u01ff]]+$

However, despite using the Unicode format described in the previous question, characters such as ß or è are not matched by the regex.

I am using JDK 6.

What am I missing? Thanks!

Ionut

2 Answers

3

Use the Unicode category \p{L} for "any letter":

System.out.println("abcßè".matches("\\p{L}+")); // true
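
If the input may contain decomposed characters (a base letter followed by a combining mark, such as `e` plus U+0300), `\p{L}+` on its own will not match the mark. The following is a minimal sketch, not part of the original answer, that allows any number of combining marks after each letter. The class name `LetterMatch` and the helper `isLettersOnly` are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

class LetterMatch {
    // One or more letters, each optionally followed by combining marks.
    private static final Pattern LETTERS = Pattern.compile("(?:\\p{L}\\p{M}*)+");

    // Hypothetical helper: true if the whole string consists of letters
    // (with optional combining marks) and nothing else.
    static boolean isLettersOnly(String s) {
        return LETTERS.matcher(s).matches();
    }

    public static void main(String[] args) {
        // \u00DF is sharp s, \u00E8 is precomposed e with grave accent.
        System.out.println(isLettersOnly("abc\u00DF\u00E8"));
        // e followed by U+0300 (combining grave accent).
        System.out.println(isLettersOnly("abce\u0300"));
        // A digit is not a letter, so this one should be rejected.
        System.out.println(isLettersOnly("abc1"));
    }
}
```

The non-capturing group keeps the pattern usable with `matches()` without creating capture groups that are never read.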
Bohemian
  • I'm receiving this exception: `java.util.regex.PatternSyntaxException: Unknown character property name {Latin} near index 10 \p{IsLatin}+`. I'm guessing that there is something wrong with the format but I have no idea what. – Ionut Feb 07 '14 at 14:33
  • Did you perhaps code `"\\p{Latin}"` instead of `"\\p{IsLatin}"`? Anyway, I just realised you can simply use `"\\p{L}"` for "any letter" - see updated answer. – Bohemian Feb 07 '14 at 14:39
  • It is `IsLatin`. For instance, you can test it here: http://java-regex-tester.appspot.com/ and you'll get the same error – Ionut Feb 07 '14 at 14:45
  • That site doesn't understand `\p{IsLatin}` but **does** understand `\p{L}` (maybe JDK version thing). Have you tried `"\\p{L}"` like my updated answer? – Bohemian Feb 07 '14 at 14:55
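
On the "JDK version thing": Unicode script names such as `\p{IsLatin}` were added to `java.util.regex` in Java 7, which is consistent with the exception seen on JDK 6. As a rough sketch, one way to check what a given runtime accepts is to try compiling the pattern. The class `PropertyCheck` and the helper `compiles` are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class PropertyCheck {
    // Hypothetical helper: reports whether the running JDK can compile the regex.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Thrown for unknown property names, among other syntax errors.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("\\p{L}+       compiles: " + compiles("\\p{L}+"));
        // Expected to fail on JDK 6, where script names are not supported.
        System.out.println("\\p{IsLatin}+ compiles: " + compiles("\\p{IsLatin}+"));
    }
}
```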
0

Using Java 6, this code

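 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 public static void main(String[] args) {
     String str = "hello ß you";
     // (?:...) is a non-capturing group; \P{M}\p{M}* is one non-mark code point plus any combining marks
     Pattern p = Pattern.compile("(?:\\P{M}\\p{M}*)+");
     Matcher matcher = p.matcher(str);
     // Remove every matched run from the input
     System.out.println("replaced: '" + matcher.replaceAll("") + "'");
 }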

prints: replaced: ''

The 'ß' is matched.
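
Note that `\P{M}` means "any code point that is not a combining mark", so it accepts far more than letters. A small sketch of the difference, assuming the goal is letters only; the class `MarkVsLetter` and its field names are hypothetical:

```java
import java.util.regex.Pattern;

class MarkVsLetter {
    // Any non-mark code point followed by optional combining marks.
    static final Pattern NON_MARK_RUN = Pattern.compile("(?:\\P{M}\\p{M}*)+");
    // Letters only (Unicode general category L).
    static final Pattern LETTER_RUN = Pattern.compile("\\p{L}+");

    public static void main(String[] args) {
        // \u00DF is sharp s; the other samples contain a space, digits, and an underscore.
        String[] samples = {"stra\u00DFe", "abc 123", "a_b"};
        for (String s : samples) {
            System.out.println("'" + s + "'"
                    + " non-mark run: " + NON_MARK_RUN.matcher(s).matches()
                    + ", letters only: " + LETTER_RUN.matcher(s).matches());
        }
    }
}
```

Only the first sample passes the letters-only pattern, while all three pass the `\P{M}\p{M}*` pattern.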

Antoine Wils
  • Hi! Thanks but it seems that this pattern also matches digits and other special chars like `_` or space. – Ionut Feb 07 '14 at 14:33