Please read the ASCII wikipedia page. You'll learn that ASCII has no "one half" character.
These days, most systems can be configured to use UTF-8 encoding (which is the "default" or at least the most commonly used encoding on the Web and on Unix systems).
UTF-8 is a variable length encoding for Unicode. So many characters or glyphs are represented by several bytes. For the ½
(officially the vulgar fraction one half unicode character) its UTF8 encoding is the two hex bytes 0Xc2 0xBD so in C notation \302\275
I am using the Linux Gnome Character Map utility gucharmap
to find all that.
You might be interested in UTF-32 (a fixed length encoding using 32 bits characters, in which ½ is represented by 0x000000BD), or even UTF-16 in which many characters are 16 bits (in particular ½
is 0x00BD e.g. one 16 bit character in UTF-16), but not all. You may also be interested in wide characters i.e. the wchar_t
of recent C or C++ standards (which is UTF-16 on Windows and UTF-32 on many Unix).
FWIW, Qt is using QChar as UTF-16 (Java also has UTF-16 char
...), but Gtk use UTF-8 i.e. variable length characters.
Notice that with variable length character encodings like UTF-8 getting the N-th character (which is not the N-th byte!) in a string requires to scan the string. Also, some byte combinations are not valid UTF-8.