Check if string contains CJK (chinese) characters

Question

I need to check whether a string contains Chinese characters. After searching, I found that I should use a regex with the pattern \u31C0-\u31EF, but I can't get the regex to work.

Has anyone dealt with this before? Is the regex correct?

Using `"[\u31C0-\u31EF]"` will indeed match any character whose code point is in the range `0x31C0` to `0x31EF`. You need the square brackets. I have no idea whether the actual numbers are correct; there are only 48 characters in this range, and I thought CJK had a lot more than that, but what do I know? — ajb, Feb 26 '14 at 17:29
There's definitely more characters in CJK, see [here](http://en.wikipedia.org/wiki/CJK_Unified_Ideographs). — juan.facorro, Feb 26 '14 at 17:37
The duplicate is not marked with a java tag. Is this really a duplicate? — Suragch, Jan 31 '17 at 10:48

score 2 · Accepted Answer · edited May 23 '17 at 11:45

As discussed here, in Java 7 and later (i.e. when the regex compiler meets requirement RL1.2 Properties from UTS #18, Unicode Regular Expressions), you can use the following regex to match a Chinese (well, CJK) character:

\p{script=Han}

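which can be abbreviated. Note that Java's `java.util.regex` requires the `Is` prefix for the short form (plain `\p{Han}` works in some other engines, such as Perl, but Java rejects it as an unknown property name):

\p{IsHan}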
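
For illustration, here is a minimal sketch of how the script-based check might look in Java. The class name `CjkCheck` and its method names are hypothetical, not part of any library. The regex variant assumes Java 7 or later; the code-point variant assumes Java 8 or later because it uses `String.codePoints()`. The string literals use Unicode escapes so the result does not depend on the source file encoding.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class CjkCheck {
    // \p{IsHan} matches any code point whose Unicode script is Han.
    // \p{script=Han} and \p{sc=Han} are equivalent spellings in Java.
    private static final Pattern HAN = Pattern.compile("\\p{IsHan}");

    /** Returns true if the text contains at least one Han-script character. */
    static boolean containsHan(String text) {
        return HAN.matcher(text).find();
    }

    /** Alternative without a regex: inspect each code point's script directly. */
    static boolean containsHanByCodePoint(String text) {
        return text.codePoints()
                   .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    public static void main(String[] args) {
        System.out.println(containsHan("hello"));               // false
        System.out.println(containsHan("hello \u4E2D\u6587"));  // true  (U+4E2D, U+6587 are Han ideographs)
        System.out.println(containsHan("\u3042"));              // false (U+3042 is Hiragana, not Han)
    }
}
```

Keep in mind that the Han script covers ideographs shared by Chinese, Japanese (kanji), and Korean (hanja), so a match tells you the string contains a CJK ideograph, not that the text is Chinese.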

answered Feb 26 '14 at 17:34

herohuyongtao
