TL;DR:
Half-width: Regular width characters.
Eg. 'A' and 'ニ'
Full-width: Chars that take two monospaced English chars' space on the display
Eg. '中', 'に' and 'A'
I need an implementation of this function:
/**
* @return Is this character a full-width character or not.
*/
fun Char.isFullWidth(): Boolean
{
// What is the most efficient implementation here?
}
No this is not about data structures for those chars, it's only about the displayed width.
Long Story:
I'm refactoring HyLogger, a logging library focused on text-coloring with gradients. Here is the problem I ran into:
If you look at the first gradient text block printed in the screenshot, the full-width text in the middle messed up the gradient pattern after it, because when calling string.length
, they are counted as one character even though they take up twice the size.
You might be asking, why on earth would anyone print full-width characters? This is a real problem because almost all characters in languages like Chinese, Japanese, or Korean are full-width, therefore takes twice the space, similar to the English full-width characters.
So I need a way to identify full-width characters so that I can calculate them as two gradient-pixels instead of one to solve the problem in the picture.
Known Info:
C++ check if unicode character is full width :
There is a list of East Asian Width characters on the Unicode website (and also the report), but it's probably not efficient to traverse this entire list for every single character when rendering a gradient text block.
Python has this Unicode database library, one possible solution is to call python API using Jython, which would be heavy and the efficiency is probably not very good.
Analyzing full width or half width character in Java :
- The ICU4J library has Unicode tools to achieve this function, but that library is 12.5 MB large, which isn't optimal for my 50 KB logger library.