I am doing an export to ADAM and would like to know: what is the maximum character I can send in a Unicode string?
1 Answer
The current Unicode standard defines 1,114,112 code points (U+0000 through U+10FFFF). A far smaller number (109,384 in version 6 of the standard) are currently assigned to characters, excluding the private-use areas. If you are actually after the maximum number of bytes in a character, that depends on the encoding you are using. For example, UTF-8 uses between 1 and 4 bytes for a valid code point.
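To make the dependence on encoding concrete, here is a minimal Python 3 sketch that measures how many bytes a single code point needs in UTF-8, UTF-16 and UTF-32. The sample characters and the helper name `encoded_sizes` are illustrative choices for this sketch, not anything specific to ADAM.

```python
import sys

# Highest code point in the Unicode code space; the space holds
# MAX_CODE_POINT + 1 = 1,114,112 code points in total.
MAX_CODE_POINT = 0x10FFFF

def encoded_sizes(ch):
    """Return the number of bytes one code point occupies in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The "-le" variants are used so that no byte-order mark is counted.
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }

# Illustrative samples: ASCII, Latin-1 range, a BMP symbol,
# a supplementary-plane emoji, and the last code point.
samples = ["A", "\u00E9", "\u20AC", "\U0001F600", chr(MAX_CODE_POINT)]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

Characters outside the Basic Multilingual Plane take four bytes in all three encodings; in UTF-16 that is two 16-bit code units (a surrogate pair) for one code point.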

borrible
Both UTF-8 and UTF-16 can use up to four bytes for a character. Don't confuse code *units* with code *points*. – Joey Dec 14 '11 at 13:23