
This also applies to char32_t and the intXX_t types. The standard says:

2.14.3.2:

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

5.3.3.1:

[..] in particular [..] sizeof(char16_t), sizeof(char32_t), and sizeof(wchar_t) are implementation-defined

I cannot see anything about the intXX_t types, apart from the comment that they are "optional" (18.4.1).

If a char16_t isn't guaranteed to be 2 bytes, is it guaranteed to be 16 bits (even on architectures where 1 byte != 8 bits)?

Marc Mutz - mmutz
0xbadf00d

3 Answers


3.9.1 Fundamental types [basic.fundamental]

Types char16_t and char32_t denote distinct types with the same size, signedness, and alignment as uint_least16_t and uint_least32_t, respectively, in `<cstdint>`, called the underlying types.

This means char16_t is at least 16 bits (but may be larger).

But I also believe:

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

provides the same guarantee, though less explicitly: you have to know that ISO 10646 is UCS (and note that UCS is compatible with, but not exactly the same as, Unicode).

Martin York

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

This is impossible to satisfy if char16_t isn't at least 16 bits wide, so by contradiction, it's guaranteed to be at least that wide.

Fred Foo
  • one may think so, but what would be the difference between (e.g.) `std::int16_t` and `std::int_least16_t`? – 0xbadf00d Jun 22 '11 at 13:52
  • @FrEEzE2046: `int16_t` is an optional typedef while `int_least16_t` is a mandatory one. If `<cstdint>` provides `int16_t`, it is guaranteed to be an exact-16-bit type with 2's complement encoding for negative numbers. `int_least16_t` may be larger than 16 bits. Consider a 32-bit machine that does not support 8-bit or 16-bit arithmetic. – sellibitze Jun 22 '11 at 13:54
  • @FrEEzE2046: `int16_t` is for dense packing of 16-bit values. `int_least16_t` is for fast packing. `char16_t` is for UTF-16 strings. – Fred Foo Jun 22 '11 at 13:56
  • no, int_fast16_t would be for "fast". int_least16_t is the smallest integer type that is as least 16 bits wide (so, 16 bit if a 16 bit type exists, otherwise the next biggest). – etarion Jun 22 '11 at 14:04

It can't be guaranteed to be exactly 16 bits, since there are platforms which don't support types that small (for example, DSPs often can't address anything smaller than their word size, which may be 24, 32 or 64 bits). Your first quote guarantees that it will be at least 16 bits.

Mike Seymour