Is there a way to check whether a Chinese character is Simplified or Traditional Chinese in Python 3?
Viewed 2,847 times
5
- http://cjklib.org/0.3/library/cjklib.characterlookup.html seems to hold some promise but I'm not competent to write a useful answer from that. – tripleee Sep 12 '15 at 18:16
- related: [What's the complete range for Chinese characters in Unicode?](http://stackoverflow.com/q/1366068/4279) – jfs Sep 13 '15 at 16:14
2 Answers
6
`cjklib` does not support Python 3. In Python 3, you can use `hanzidentifier`.
import hanzidentifier

# has_chinese() reports whether the string contains any Chinese characters.
print(hanzidentifier.has_chinese('Hello my name is John.'))
# False
print(hanzidentifier.has_chinese('Country in Simplified: 国家. Country in Traditional: 國家.'))
# True

# Non-Chinese characters such as 'John' and punctuation are ignored.
print(hanzidentifier.is_simplified('John说:你好!'))
# True
print(hanzidentifier.is_traditional('John說:你好!'))
# True

Blckknght

Hong Zher Tan
1
You can use `getCharacterVariants()` in `cjklib` to query the character's simplified (`S`) and traditional (`T`) variants. As described in the Unihan database documentation, you can use this data to determine the classification for a character.
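To illustrate the idea of classifying a character from its variant data, here is a minimal sketch that does not use `cjklib` at all. The two dictionaries are a tiny hand-picked sample standing in for real variant data (for example the Unihan `kTraditionalVariant` and `kSimplifiedVariant` fields), and the `classify` function and its labels are hypothetical names chosen for this example. The rule is an assumption about how such data is usually read: a character that has a traditional variant other than itself is treated as simplified, and the reverse for traditional.

```python
# Hand-picked sample only; a real table would be loaded from variant data
# such as the Unihan database. Names here are illustrative, not a library API.
TRADITIONAL_VARIANTS = {"国": {"國"}, "说": {"說"}, "发": {"發", "髮"}}
SIMPLIFIED_VARIANTS = {"國": {"国"}, "說": {"说"}, "發": {"发"}, "髮": {"发"}}


def classify(char):
    """Classify a single character using the sample variant tables above."""
    # Ignore self-mappings: a character listed as its own variant tells us nothing.
    has_traditional = bool(TRADITIONAL_VARIANTS.get(char, set()) - {char})
    has_simplified = bool(SIMPLIFIED_VARIANTS.get(char, set()) - {char})

    if has_traditional and has_simplified:
        return "both"
    if has_traditional:
        return "simplified"
    if has_simplified:
        return "traditional"
    # No variant data: the character is either shared by both scripts
    # or simply missing from the table.
    return "shared or unknown"


for ch in "国國你":
    print(ch, classify(ch))
```

With a complete variant table the same function would work for any character; with this sample, anything outside the six listed characters falls into the last branch.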

一二三