How many bytes are required to store one character in:
- Microsoft's implementation of the .NET framework, version 4
- JavaScript, as implemented by Microsoft Internet Explorer 8?
Both .NET and JavaScript use UTF-16:
> Represents each Unicode code point as a sequence of one or two 16-bit integers. Most common Unicode characters require only one UTF-16 code point, although Unicode supplementary characters (U+10000 and greater) require two UTF-16 surrogate code points. Both little-endian and big-endian byte orders are supported.
So a single character can take either 16 bits (2 bytes) or 32 bits (4 bytes).
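For example, here is a minimal C# sketch (the class name and the chosen characters are just for illustration) that checks both sizes with Encoding.Unicode, .NET's UTF-16 encoding:

```csharp
using System;
using System.Text;

class ByteCounts
{
    static void Main()
    {
        // "A" is a BMP character: one UTF-16 code unit, so 2 bytes.
        Console.WriteLine(Encoding.Unicode.GetByteCount("A"));          // 2

        // U+1D11E (MUSICAL SYMBOL G CLEF) is a supplementary character:
        // a surrogate pair of two code units, so 4 bytes.
        Console.WriteLine(Encoding.Unicode.GetByteCount("\U0001D11E")); // 4
    }
}
```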
Both .NET and JavaScript use UTF-16. UTF-16 is a so-called variable-length encoding that uses 16-bit code units to represent Unicode code points (which are 21 bits in length). Historically it grew out of UCS-2, from the time when Unicode was still a 16-bit code (which was later deemed insufficient, hence the expansion to 21 bits).
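The surrogate mechanism behind that is plain arithmetic. A small C# sketch of it (U+1F600 is just an assumed example code point) splits a supplementary code point into its two surrogate code units:

```csharp
using System;

class SurrogatePair
{
    static void Main()
    {
        // Surrogate-pair arithmetic for a code point above U+FFFF.
        int codePoint = 0x1F600;
        int v = codePoint - 0x10000;               // the remaining 20 bits
        char high = (char)(0xD800 + (v >> 10));    // high surrogate (top 10 bits)
        char low  = (char)(0xDC00 + (v & 0x3FF));  // low surrogate (bottom 10 bits)
        Console.WriteLine("{0:X4} {1:X4}", (int)high, (int)low); // D83D DE00
    }
}
```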
Since UTF-16 uses 16-bit code units, the code itself is a 16-bit code, but to represent a character you have to look more closely at what you actually mean. There are two cases:

- Character in the Unicode sense means a Unicode code point, which is probably your intended meaning. A code point is encoded as either one or two UTF-16 code units, i.e. two or four bytes.
- Character in the usual meaning often refers to a grapheme, actually, which is what we perceive as a single character. A grapheme can carry arbitrarily many diacritics, or may be a ligature that the rendering engine forms out of multiple code points. Long story short: in this case a character can be arbitrarily long, since it can consist of several code points (see the sketch after this list).
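To make the distinction concrete, here is a small C# sketch (the combining-accent string is an assumed example): string.Length counts UTF-16 code units, while System.Globalization.StringInfo counts graphemes, which .NET calls "text elements":

```csharp
using System;
using System.Globalization;

class CharacterCounts
{
    static void Main()
    {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT:
        // one perceived character (grapheme), made of two code points.
        string s = "e\u0301";

        Console.WriteLine(s.Length);                               // 2 UTF-16 code units
        Console.WriteLine(new StringInfo(s).LengthInTextElements); // 1 grapheme ("text element")
    }
}
```

JavaScript behaves like the first line: its String.length also counts UTF-16 code units, not graphemes.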