I am working with UTF-8 strings in C++. I've found that the characters of certain languages occupy a particular code point range in Unicode; for example, Chinese characters fall in the range U+4E00 to U+9FFF.

But how can I find out whether a given UTF-8 string contains Chinese, using C++?

Mike Kinghan
Yang
  • You can't, because characters and language are not necessarily strictly related. What language is a part of? Even the block you're talking about can be used natively in one of at least 4 different languages. – user657267 May 19 '16 at 06:06
  • are these characters 中, 十... Chinese, Japanese, Korean or Vietnamese? – phuclv May 19 '16 at 07:29
  • @LưuVĩnhPhúc it depends. Briefly, because Japan is close to China, the two have had cultural exchange throughout history, so Chinese has influenced part of Japanese. If these characters appear in a Japanese context, or alongside Japanese-only text such as "サービスは", they are Japanese. Otherwise they are Chinese. – Yang Jun 27 '16 at 06:40
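
For the narrower question of whether a UTF-8 string contains any code point in the range U+4E00 to U+9FFF mentioned in the question, a minimal sketch could decode the bytes into code points and test each one. The function names `decode_utf8` and `contains_cjk_ideograph` are made up for this illustration. As the comments point out, a match only says the string contains a Han ideograph from that block; it does not say the text is Chinese rather than Japanese or Korean, and ideographs outside that block are not covered.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode one code point from s starting at byte index i.
// On success, stores the code point in cp, advances i past the sequence,
// and returns true. On malformed input, advances i by one byte and returns false.
// Note: surrogate code points and values above U+10FFFF are not rejected here.
bool decode_utf8(const std::string& s, std::size_t& i, std::uint32_t& cp) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const std::uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};

    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    std::size_t len = 0;
    if (b0 < 0x80)                { cp = b0;        len = 1; }
    else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
    else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
    else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
    else { ++i; return false; }  // stray continuation byte or invalid lead byte

    if (i + len > s.size()) { ++i; return false; }  // truncated sequence

    for (std::size_t k = 1; k < len; ++k) {
        const unsigned char b = static_cast<unsigned char>(s[i + k]);
        if ((b & 0xC0) != 0x80) { ++i; return false; }  // not a continuation byte
        cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min_cp[len]) { ++i; return false; }  // overlong encoding

    i += len;
    return true;
}

// Hypothetical helper: true if the UTF-8 string holds at least one code point
// in U+4E00..U+9FFF, the range given in the question.
bool contains_cjk_ideograph(const std::string& utf8) {
    std::size_t i = 0;
    while (i < utf8.size()) {
        std::uint32_t cp = 0;
        if (decode_utf8(utf8, i, cp) && cp >= 0x4E00 && cp <= 0x9FFF) {
            return true;
        }
    }
    return false;
}
```

With this sketch, `contains_cjk_ideograph("\xE4\xB8\xAD")` (the bytes of 中, U+4E2D) would return true, while a string made only of ASCII or katakana would return false. Deciding which language the text is in would need more context than a code point range can give.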

0 Answers