Questions tagged [cjk]

CJK stands for Chinese, Japanese, and Korean. The tag labels issues common to these East Asian languages and their large character repertoires, which are covered by various character sets and encodings, including:

  • Big5
  • EUC-JP
  • EUC-KR
  • Shift-JIS
  • GB2312
  • GB18030
  • ISO-2022-JP
  • Unicode
1096 questions
219
votes
3 answers

How does Chrome decide what to highlight when you double-click Japanese text?

If you double-click English text in Chrome, the whitespace-delimited word you clicked on is highlighted. This is not surprising. However, the other day I was clicking while reading some text in Japanese and noticed that some words were highlighted…
polm23
  • 14,456
  • 7
  • 35
  • 59
124
votes
3 answers

What are the most common non-BMP Unicode characters in actual use?

In your experience which Unicode characters, codepoints, ranges outside the BMP (Basic Multilingual Plane) are the most common so far? These are the ones which require 4 bytes in UTF-8 or surrogates in UTF-16. I would've expected the answer to be…
hippietrail
  • 15,848
  • 18
  • 99
  • 158
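
The excerpt above notes that non-BMP characters need 4 bytes in UTF-8 or a surrogate pair in UTF-16. A minimal Python sketch of that claim follows; the helper name `encoded_sizes` and the two sample characters are illustrative assumptions, not taken from the question.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the code point and encoded sizes for a single character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        # Number of bytes the character occupies in UTF-8.
        "utf8_bytes": len(ch.encode("utf-8")),
        # Number of 16-bit code units in UTF-16 (2 means a surrogate pair).
        "utf16_units": len(ch.encode("utf-16-le")) // 2,
    }


# Hypothetical sample characters chosen for illustration.
bmp = encoded_sizes("\u4e2d")          # a Han character inside the BMP
non_bmp = encoded_sizes("\U00020000")  # first code point of CJK Extension B

print(bmp)
print(non_bmp)
```

Any code point above U+FFFF should report 4 UTF-8 bytes and 2 UTF-16 code units with this helper.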
119
votes
8 answers

What's the complete range for Chinese characters in Unicode?

Unicode allocated U+4E00..U+9FFF for Chinese characters. This range covers only part of the complete set.
omg
  • 136,412
  • 142
  • 288
  • 348
98
votes
4 answers

Language codes for simplified Chinese and traditional Chinese?

We are creating multi-language subsites on our website. I would like to use the 2-letter language codes. Spanish and French are easy. They will get URLs like: mydomain.com/es mydomain.com/fr but I run into a problem with Traditional and…
jeph perro
  • 6,242
  • 26
  • 90
  • 124
94
votes
5 answers

Java regex to support Unicode?

To match A to Z, we use the regex [A-Za-z]. How can a regex match UTF-8 characters entered by the user, for example Chinese words like 环保部?
cometta
  • 35,071
  • 77
  • 215
  • 324
54
votes
6 answers

Convert or extract TTC font to TTF - how to?

I have already spent more than 8 hours trying to make the STHeiti Medium.ttc.zip font work on Windows, but I can't make it work. Has anybody been able to make it work on Windows? If so, please share the steps.
Pikk
  • 2,343
  • 6
  • 25
  • 41
47
votes
4 answers

Detect Windows font size (100%, 125%, and 150%)

I created an application that works perfectly until the user selects 125% or 150%. It would break my application. I later found a way to find the font size by detecting the DPI. This was working great until people with Chinese versions of Windows 7…
Landin Martens
  • 3,283
  • 12
  • 43
  • 61
46
votes
2 answers

Flutter: Japanese characters fetched from server are decoded wrong

I am building a mobile app with Flutter. I need to fetch a JSON file from a server which includes Japanese text. A part of the returned JSON is: { "id": "egsPu39L5bLhx3m21t1n", "userId": "MCetEAeZviyYn5IMYjnp", "userName": "巽 裕亮", …
Tran Triet
  • 1,257
  • 2
  • 16
  • 34
32
votes
3 answers

What is the encoding of Chinese characters on Wikipedia?

I was looking at the encoding of Chinese characters on Wikipedia and I'm having trouble figuring out what they are using. For instance "的" is encoded as "%E7%9A%84" (see here). That's three bytes, however none of the encodings described on this page…
laurent
  • 88,262
  • 77
  • 290
  • 428
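
The three bytes quoted in the excerpt above match what UTF-8 percent-encoding produces for that character. A small sketch using Python's standard `urllib.parse` shows the round trip; the variable names are illustrative, and this is not a statement about how Wikipedia's servers are implemented.

```python
from urllib.parse import quote, unquote

ch = "的"

# Raw UTF-8 bytes of the character, shown as hex pairs.
utf8_hex = ch.encode("utf-8").hex(" ").upper()

# quote() percent-encodes non-ASCII text as UTF-8 by default.
percent_encoded = quote(ch)

# unquote() reverses the encoding, again assuming UTF-8.
round_trip = unquote(percent_encoded)

print(utf8_hex)
print(percent_encoded)
print(round_trip)
```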
31
votes
4 answers

PHP - regular expression to check if the string has Chinese chars

I have the string $str and I want to check whether its content has Chinese characters or not (true/false): $str = "赕就可消垻,只有当所有方块都被消垻时才可以过关"; Can you please help me? Thanks! Adrian
Adrian
  • 997
  • 2
  • 9
  • 13
29
votes
4 answers

Is there any good open-source or freely available Chinese segmentation algorithm available?

As phrased in the question, I'm looking for a free and/or open-source text-segmentation algorithm for Chinese. I do understand it is a very difficult task to solve, as there are many ambiguities involved. I know there's Google's API, but well it is…
Sebastian
  • 6,293
  • 6
  • 34
  • 47
28
votes
2 answers

Find all Chinese text in a string using Python and Regex

I needed to strip the Chinese out of a bunch of strings today and was looking for a simple Python regex. Any suggestions?
prairiedogg
  • 6,323
  • 8
  • 44
  • 52
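
For the Python question above, one common starting point is a character class over the basic CJK Unified Ideographs block. The sketch below assumes that matching only U+4E00..U+9FFF is acceptable; it deliberately ignores the extension blocks and CJK punctuation, and the names `HAN_BASIC`, `find_han`, and `strip_han` are hypothetical.

```python
import re

# Assumption: only the basic CJK Unified Ideographs block is covered.
# Extension blocks (for example U+3400..U+4DBF) are not matched.
HAN_BASIC = re.compile(r"[\u4e00-\u9fff]+")


def find_han(text: str) -> list:
    """Return every run of basic-block Han characters in text."""
    return HAN_BASIC.findall(text)


def strip_han(text: str) -> str:
    """Return text with every basic-block Han character removed."""
    return HAN_BASIC.sub("", text)


sample = "Hello 环保部 world 你好"
print(find_han(sample))
print(repr(strip_han(sample)))
```

Stripping leaves the surrounding whitespace untouched, so callers may want to normalize spaces afterwards.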
27
votes
3 answers

Japanese characters looking like Chinese on Android

PREAMBLE: since API 17 (Android 4.2), there's a method TextView.setTextLocale() that explicitly solves this problem for TextViews and derived classes. Assign a Japanese locale (Locale.JAPAN), and Unihan characters will look Japanese. I have an…
Seva Alekseyev
  • 59,826
  • 25
  • 160
  • 281
27
votes
2 answers

SQL Server database field to handle Korean and Chinese characters

Is it possible to have a field in SQL Server that can store Chinese, Korean and European characters? My Chinese characters just become ????? The datatype is NVARCHAR as well.
Alessandro
  • 305
  • 2
  • 8
  • 22
25
votes
4 answers

Testing Android Market in-app billing with dummy credit card credentials

I have configured an Android application to use the in-app billing module as documented at: http://developer.android.com/guide/market/billing/index.html It works fine when tested using the UK development team's accounts which have real credit cards…
Kaiesh
  • 1,042
  • 2
  • 14
  • 21