I'm trying to format a UTF-8 free-text string to fit an exact column width on a terminal. I'm coding various truncation methods (left/middle/right) for long strings. However, when the truncation break point falls in the middle of a wide character, such as an emoji, the display column count falls apart. Some form of padding is needed for the leftover 'half-wide' column.
Is there a suitable narrow character that indicates we have a valid Unicode character but insufficient display space to show it, as opposed to the replacement character � (U+FFFD) usually used for invalid Unicode?
Example: on a fixed-width terminal, fit two smiley emojis into the space that would fit 'aaa', e.g. "😀😀". Two emojis need four columns but only three are available, so I need a (preferably standardised) substitute character for the second emoji/wide character, e.g. "⋮", to fill that three-column space.
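To make the problem concrete, here is a minimal Python sketch of right truncation under some stated assumptions: column widths are estimated from `unicodedata.east_asian_width` (which will not match every terminal), the filler "⋮" is just my example character and is assumed to occupy one column, and the names `char_width` and `fit_right` are made up for illustration.

```python
import unicodedata

def char_width(ch: str) -> int:
    """Estimate the terminal columns used by one code point (approximation only)."""
    # Combining marks and format characters are treated as zero width.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # Wide and Fullwidth characters (CJK, most emoji) take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

def fit_right(text: str, width: int, filler: str = "⋮") -> str:
    """Truncate on the right to exactly `width` columns, padding with spaces.

    `filler` is assumed to be one column wide; it replaces a wide character
    that would straddle the right-hand boundary.
    """
    out = []
    used = 0
    for ch in text:
        w = char_width(ch)
        if used + w > width:
            # A wide character does not fit but one column is still free.
            if used < width:
                out.append(filler)
                used += 1
            break
        out.append(ch)
        used += w
    # Pad short strings so the result always fills the column width.
    return "".join(out) + " " * (width - used)
```

With these assumptions, `fit_right("😀😀", 3)` keeps the first emoji and puts the filler in the third column. The left and middle truncation variants would need the same half-column handling at their own cut points.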
A side issue is working out where decomposed composite characters start and end (also, are there combining prefixes?). It looks like the next code point needs to be read to see whether it is zero width: e.g. for 'o' U+006F followed by the combining diaeresis (umlaut) U+0308, rather than the precomposed ö U+00F6, don't stop after the plain 'o'.
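The look-ahead I have in mind can be sketched like this in Python. It is a simplified grouping that attaches following combining marks to the preceding base character, not full grapheme cluster segmentation (it does not handle emoji joined with ZWJ, for example), and `clusters` is a hypothetical helper name.

```python
import unicodedata

def is_mark(ch: str) -> bool:
    """True for code points treated here as zero-width combining marks."""
    return unicodedata.combining(ch) != 0 or unicodedata.category(ch) in ("Mn", "Me")

def clusters(text: str):
    """Yield each base character together with the combining marks that follow it."""
    current = ""
    for ch in text:
        if current and is_mark(ch):
            # Still part of the same displayed character: keep reading.
            current += ch
        else:
            if current:
                yield current
            current = ch
    if current:
        yield current
```

For the decomposed example, `list(clusters("o\u0308k"))` gives two items: the 'o' with its diaeresis, then 'k'. Normalising first with `unicodedata.normalize("NFC", text)` composes the pair into U+00F6, but not every base-plus-mark sequence has a precomposed form, so the look-ahead still seems necessary.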