The regular expression \p{N} does not recognise Chinese numerals.
Please suggest a correct regex in Java for this.
My answer is based on the Wikipedia article on Chinese numerals:
Common numerals: 0 to 10, hundred, thousand, ten thousand, hundred million
零〇一二三四五六七八九十百千: \u96f6\u3007\u4e00\u4e8c\u4e09\u56db\u4e94\u516d\u4e03\u516b\u4e5d\u5341\u767e\u5343
(Simplified) 万亿: \u4e07\u4ebf
(Traditional) 萬億: \u842c\u5104
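As a minimal sketch, the common numerals listed above can be put into a single character class. The class name CommonChineseNumerals and the method isCommonNumeral are made up for illustration. The last line of main checks \p{N} against U+4E09 (three); as far as I can tell it fails because these ideographs are categorised as letters rather than numbers, with U+3007 being an exception.

```java
import java.util.regex.Pattern;

class CommonChineseNumerals {
    // Character class built from the escapes listed above:
    // 0 to 10, hundred, thousand, plus both forms of ten thousand and hundred million.
    static final Pattern COMMON = Pattern.compile(
        "[\\u96f6\\u3007\\u4e00\\u4e8c\\u4e09\\u56db\\u4e94\\u516d\\u4e03\\u516b\\u4e5d"
        + "\\u5341\\u767e\\u5343\\u4e07\\u4ebf\\u842c\\u5104]+");

    // True if the whole string consists only of common Chinese numerals.
    static boolean isCommonNumeral(String s) {
        return COMMON.matcher(s).matches();
    }

    public static void main(String[] args) {
        // U+4E09 U+767E U+4E8C U+5341 U+4E00 is "three hundred twenty one".
        System.out.println(isCommonNumeral("\u4e09\u767e\u4e8c\u5341\u4e00"));
        System.out.println(isCommonNumeral("abc"));
        // \p{N} against U+4E09 (three), for comparison with the class above.
        System.out.println(Pattern.matches("\\p{N}", "\u4e09"));
    }
}
```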
Financial use
(Simplified) 零壹贰叁肆伍陆柒捌玖拾佰仟萬億: \u96f6\u58f9\u8d30\u53c1\u8086\u4f0d\u9646\u67d2\u634c\u7396\u62fe\u4f70\u4edf\u842c\u5104
(Traditional) 零壹貳參肆伍陸柒捌玖拾佰仟萬億: \u96f6\u58f9\u8cb3\u53c3\u8086\u4f0d\u9678\u67d2\u634c\u7396\u62fe\u4f70\u4edf\u842c\u5104
The two versions differ at 2, 3, and 6. Some of the characters overlap with the common numerals.
Large numbers beyond 10^12 and up to 10^44
(Traditional) 兆京垓秭穰溝澗正載: \u5146\u4eac\u5793\u79ed\u7a70\u6e9d\u6f97\u6b63\u8f09
(Simplified) 兆京垓秭穰沟涧正载: \u5146\u4eac\u5793\u79ed\u7a70\u6c9f\u6da7\u6b63\u8f7d
The two versions differ at the 6th, 7th and 9th characters.
(Some other alternatives) 經经杼壤: \u7d93\u7ecf\u677c\u58e4
Regional usage
(Traditional) 兩: \u5169
(Simplified) 两: \u4e24
Only the character above is of note. Other regional characters are not commonly used.
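Putting the groups above together, here is a rough sketch of one combined character class that pulls runs of numerals out of mixed text. The names ChineseNumeralFinder and findNumerals are hypothetical, and which groups to include is a choice you should adjust to your data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChineseNumeralFinder {
    // Each constant holds the escapes from one group in the lists above.
    static final String COMMON =
        "\\u96f6\\u3007\\u4e00\\u4e8c\\u4e09\\u56db\\u4e94\\u516d\\u4e03\\u516b\\u4e5d"
        + "\\u5341\\u767e\\u5343\\u4e07\\u4ebf\\u842c\\u5104";
    // Financial forms, simplified and traditional (characters already in COMMON omitted).
    static final String FINANCIAL =
        "\\u58f9\\u8d30\\u53c1\\u8086\\u4f0d\\u9646\\u67d2\\u634c\\u7396\\u62fe\\u4f70\\u4edf"
        + "\\u8cb3\\u53c3\\u9678";
    // Large numbers, traditional and simplified, plus the listed alternatives.
    static final String LARGE =
        "\\u5146\\u4eac\\u5793\\u79ed\\u7a70\\u6e9d\\u6f97\\u6b63\\u8f09"
        + "\\u6c9f\\u6da7\\u8f7d\\u7d93\\u7ecf\\u677c\\u58e4";
    // Regional forms of "two".
    static final String REGIONAL = "\\u5169\\u4e24";

    static final Pattern NUMERALS =
        Pattern.compile("[" + COMMON + FINANCIAL + LARGE + REGIONAL + "]+");

    // Returns every maximal run of numeral characters found in the text.
    static List<String> findNumerals(String text) {
        List<String> runs = new ArrayList<>();
        Matcher m = NUMERALS.matcher(text);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        // Two runs are expected here: U+4E09 U+767E and U+58F9 U+4E24.
        System.out.println(findNumerals("abc\u4e09\u767exyz\u58f9\u4e24"));
    }
}
```

Several characters in the large-number group, such as \u6b63 and \u4eac, also occur in everyday words, so you may want to drop LARGE from the class if false positives matter.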