
This also applies to char32_t and the intXX_t types. The standard says:

2.14.3.2:

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

5.3.3.1:

[..] in particular [..] sizeof(char16_t), sizeof(char32_t), and sizeof(wchar_t) are implementation-defined

I cannot see anything about the intXX_t types, apart from the comment that they are "optional" (18.4.1).

If a char16_t isn't guaranteed to be 2 bytes, is it guaranteed to be 16 bits (even on architectures where 1 byte != 8 bits)?

Marc Mutz - mmutz
0xbadf00d

3 Answers


3.9.1 Fundamental types [basic.fundamental]

Types char16_t and char32_t denote distinct types with the same size, signedness, and alignment as uint_least16_t and uint_least32_t, respectively, in `<cstdint>`, called the underlying types.

This means char16_t is at least 16 bits (but may be larger).

But I also believe:

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

provides the same guarantee, though less explicitly: you have to know that ISO 10646 is UCS (and note that UCS is compatible with, but not exactly the same as, Unicode).

Martin York

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

This is impossible to satisfy if char16_t isn't at least 16 bits wide, so by contradiction, it's guaranteed to be at least that wide.

Fred Foo
  • one may think so, but what would be the difference between (e.g.) `std::int16_t` and `std::int_least16_t`? – 0xbadf00d Jun 22 '11 at 13:52
  • @FrEEzE2046: `int16_t` is an optional typedef while `int_least16_t` is a mandatory one. If `<cstdint>` provides `int16_t`, it is guaranteed to be an exact-16-bit type with 2's complement encoding for negative numbers. `int_least16_t` may be larger than 16 bits. Consider a 32-bit machine that does not support 8-bit or 16-bit arithmetic. – sellibitze Jun 22 '11 at 13:54
  • @FrEEzE2046: `int16_t` is for dense packing of 16-bit values. `int_least16_t` is for fast packing. `char16_t` is for UTF-16 strings. – Fred Foo Jun 22 '11 at 13:56
  • no, int_fast16_t would be for "fast". int_least16_t is the smallest integer type that is as least 16 bits wide (so, 16 bit if a 16 bit type exists, otherwise the next biggest). – etarion Jun 22 '11 at 14:04

It can't be guaranteed to be exactly 16 bits, since there are platforms which don't support types that small (for example, DSPs often can't address anything smaller than their word size, which may be 24, 32 or 64 bits). Your first quote guarantees that it will be at least 16 bits.

Mike Seymour