64

Given this C++11 program, should I expect to see a number or a letter? Or not make expectations?

#include <cstdint>
#include <iostream>

int main()
{
    int8_t i = 65;
    std::cout << i;
}

Does the standard specify whether this type can or will be a character type?

Drew Dormann
  • `int`, according to the specification, must be at least 16 bits – stdcall Apr 09 '13 at 20:24
  • 2
    `uint8_t` is an integer type, not a character type. I expect numbers, not letters. It looks like another C++ committee faux pas (GCC 6.3.1-1 prints them as characters). The committee got it partially right with `std::byte`. `std::byte` does not print as a character type (at the moment, it does not print at all. Hopefully that will be fixed in the future). – jww Jul 24 '17 at 21:55
  • 1
    `uint8_t` is an integer type, @jww, for sure. The problem is just that all character types (and `bool`) are integer types, too, for better and for worse. (The worse, in this case, being that compilers aren't smart enough to track type aliases to determine intended use cases.) – Justin Time - Reinstate Monica Aug 01 '23 at 21:37

5 Answers

28

From § 18.4.1 [cstdint.syn] of the C++0x FDIS (N3290), int8_t is an optional typedef that is specified as follows:

namespace std {
  typedef signed integer type int8_t;  // optional
  //...
} // namespace std

§ 3.9.1 [basic.fundamental] states:

There are five standard signed integer types: “signed char”, “short int”, “int”, “long int”, and “long long int”. In this list, each type provides at least as much storage as those preceding it in the list. There may also be implementation-defined extended signed integer types. The standard and extended signed integer types are collectively called signed integer types.

...

Types bool, char, char16_t, char32_t, wchar_t, and the signed and unsigned integer types are collectively called integral types. A synonym for integral type is integer type.

§ 3.9.1 also states:

In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.

It is tempting to conclude that int8_t may be a typedef of char provided char objects take on signed values; however, this is not the case as char is not among the list of signed integer types (standard and possibly extended signed integer types). See also Stephan T. Lavavej's comments on std::make_unsigned and std::make_signed.

Therefore, either int8_t is a typedef of signed char or it is an extended signed integer type whose objects occupy exactly 8 bits of storage.

To answer your question, though, you should not make assumptions. Because functions of both forms x.operator<<(y) and operator<<(x,y) have been defined, § 13.5.3 [over.binary] says that we refer to § 13.3.1.2 [over.match.oper] to determine the interpretation of std::cout << i. § 13.3.1.2 in turn says that the implementation selects from the set of candidate functions according to § 13.3.2 and § 13.3.3. We then look to § 13.3.3.2 [over.ics.rank] to determine that:

  • The template<class traits> basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&, signed char) template would be called if int8_t is an Exact Match for signed char (i.e. a typedef of signed char).
  • Otherwise, the int8_t would be promoted to int and the basic_ostream<charT,traits>& operator<<(int n) member function would be called.

In the case of std::cout << u for u a uint8_t object:

  • The template<class traits> basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&, unsigned char) template would be called if uint8_t is an Exact Match for unsigned char.
  • Otherwise, since int can represent all uint8_t values, the uint8_t would be promoted to int and the basic_ostream<charT,traits>& operator<<(int n) member function would be called.
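
The rules above can be summarized in a small compile-time check. This is only a sketch: `prints_as_character` and `int8_prints_as_character` are hypothetical names, not part of the standard library, and the trait only covers streams whose character type is char (such as std::cout).

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical trait: true when T (ignoring cv-qualifiers) is one of the three
// narrow character types that have character-printing operator<< overloads
// for basic_ostream<char, traits>.
template <typename T>
struct prints_as_character
    : std::integral_constant<bool,
          std::is_same<typename std::remove_cv<T>::type, char>::value ||
          std::is_same<typename std::remove_cv<T>::type, signed char>::value ||
          std::is_same<typename std::remove_cv<T>::type, unsigned char>::value>
{};

// Whether std::cout << i selects a character overload for an int8_t object
// on this particular implementation.
constexpr bool int8_prints_as_character = prints_as_character<std::int8_t>::value;
```

Where int8_t is a typedef of signed char, int8_prints_as_character is true and a letter is printed; where int8_t is an extended integer type, it is false and the value is promoted to int and printed as a number.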

If you always want to print a character, the safest and most clear option is:

std::cout << static_cast<signed char>(i);

And if you always want to print a number:

std::cout << static_cast<int>(i);
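
If these casts are needed in many places, they could be wrapped in a helper. The following is a minimal sketch; `as_number` and `number_text` are hypothetical names introduced for illustration. It relies on unary plus applying the integral promotions, so an 8-bit value reaches the stream as an int regardless of how int8_t is defined.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: unary plus applies the integral promotions, so a
// narrow integer (character types included) is returned as an int.
template <typename T>
auto as_number(T value) -> decltype(+value)
{
    return +value;
}

// Hypothetical helper: formats a value the way `stream << as_number(value)`
// would, using a string stream instead of std::cout.
template <typename T>
std::string number_text(T value)
{
    std::ostringstream out;
    out << as_number(value);
    return out.str();
}
```

With these definitions, std::cout << as_number(i) prints a number whether or not int8_t is a character type.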
Daniel Trebbien
  • "the Standard allows for `typedef char int8_t`": I believe, this is not true because `char` is an integer type but it's not a signed integer type even if it has a sign. See my post for a (hopefully correct) explanation on this (rather confusing) terminology. – Cassio Neri Apr 22 '13 at 17:32
  • @CassioNeri: The C++ Standard cannot include `char` in the list of *signed integer types* or *unsigned integer types* because the Standard allows `char` objects to either take on signed or unsigned values. So, I do not agree with your viewpoint that just because `char` is not listed in the list of *signed integer types*, this means that a `char` is not a *signed integer type* even if it takes on signed values because the Standard **can't** include `char` in either list of *signed integer types* or *unsigned integer types*. – Daniel Trebbien Apr 22 '13 at 22:52
  • 3
    Although your reasoning makes sense to me, I still believe in what I said. Apparently Stephan T. Lavavej [agrees with me](http://connect.microsoft.com/VisualStudio/feedback/details/764409/visual-studio-2012-c-std-make-unsigned): "While "char" is required to have the same signedness and range as either "signed char" or "unsigned char" (which one is implementation-defined), "char" is neither a signed integer type nor an unsigned integer type". See also [Johannes Schaub - litb](http://stackoverflow.com/users/34509/johannes-schaub-litb)'s comment [here](http://stackoverflow.com/q/9285657/1137388) – Cassio Neri Apr 23 '13 at 10:11
  • 1
    @CassioNeri: I now think that you are right. Thanks for finding those two arguments. Since everything that Stephan T. Lavavej wrote makes sense to me, I would think that `std::make_signed<int8_t>::type` would have to be identically `int8_t` because `int8_t` is specified as a *signed integer type*. Therefore, `int8_t` cannot be a `typedef` of `char` even if `char` objects take on signed values. – Daniel Trebbien Apr 23 '13 at 22:40
23

int8_t is exactly 8 bits wide (if it exists).

The only predefined integer types that can be 8 bits are char, unsigned char, and signed char. Both short and unsigned short are required to be at least 16 bits.

So int8_t must be a typedef for either signed char or plain char (the latter if plain char is signed).
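
The size requirements this argument rests on can be checked at compile time. A minimal sketch, assuming the implementation provides int8_t at all (otherwise this will not compile); the limits are the minimum magnitudes the C standard requires of `<limits.h>`.

```cpp
#include <climits>
#include <cstdint>

// A byte (the size of every char type) has at least 8 bits.
static_assert(CHAR_BIT >= 8, "char must have at least 8 bits");

// short and unsigned short must cover at least a 16-bit range,
// so neither can be an exactly-8-bit type.
static_assert(SHRT_MIN <= -32767 && SHRT_MAX >= 32767,
              "short must cover at least 16 bits");
static_assert(USHRT_MAX >= 65535, "unsigned short must cover at least 16 bits");

// If int8_t exists, it has no padding and is exactly 8 bits wide,
// which is only possible when a byte is exactly 8 bits.
static_assert(sizeof(std::int8_t) == 1 && CHAR_BIT == 8,
              "int8_t occupies exactly one 8-bit byte");
```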

If you want to print an int8_t value as an integer rather than as a character, you can explicitly convert it to int.

In principle, a C++ compiler could define an 8-bit extended integer type (perhaps called something like __int8), and make int8_t a typedef for it. The only reason I can think of to do so would be to avoid making int8_t a character type. I don't know of any C++ compilers that have actually done this.

Both int8_t and extended integer types were introduced in C99. For C, there's no particular reason to define an 8-bit extended integer type when the char types are available.

UPDATE:

I'm not entirely comfortable with this conclusion. int8_t and uint8_t were introduced in C99. In C, it doesn't particularly matter whether they're character types or not; there are no operations for which the distinction makes a real difference. (Even putc(), the lowest-level character output routine in standard C, takes the character to be printed as an int argument). int8_t, and uint8_t, if they're defined, will almost certainly be defined as character types -- but character types are just small integer types.

C++ provides specific overloaded versions of operator<< for char, signed char, and unsigned char, so that std::cout << 'A' and std::cout << 65 produce very different output. Later, C++ adopted int8_t and uint8_t, but in such a way that, as in C, they're almost certainly character types. For most operations, this doesn't matter any more than it does in C, but for std::cout << ... it does make a difference, since this:

uint8_t x = 65;
std::cout << x;

will probably print the letter A rather than the number 65.
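
To see what a given implementation actually does, the output can be captured in a string. A small sketch; `streamed` is a hypothetical helper, and the letter A corresponds to 65 only on ASCII-compatible character sets.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: returns exactly what `stream << value` would write,
// using the same operator<< overload that std::cout would select.
template <typename T>
std::string streamed(T value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}
```

Here streamed(65) yields "65" everywhere, while streamed(std::uint8_t(65)) yields a one-character string if uint8_t is unsigned char and "65" if it is an extended integer type.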

If you want consistent behavior, add a cast:

uint8_t x = 65;
std::cout << int(x); // or static_cast<int>(x) if you prefer

I think the root of the problem is that there's something missing from the language: very narrow integer types that are not character types.

As for the intent, I could speculate that the committee members either didn't think about the issue, or decided it wasn't worth addressing. One could argue (and I would) that the benefits of adding the [u]int*_t types to the standard outweigh the inconvenience of their rather odd behavior with std::cout << ....

Keith Thompson
  • 3
    I tried to find a reference for a minimum size of `short` (other than at least the size of `signed char`) and I couldn't find it - could you provide a reference? – Mark B Apr 09 '13 at 20:39
  • 5
    C++ standard 3.9.1: "The signed and unsigned integer types shall satisfy the constraints given in the C standard, section 5.2.4.2.1". C 5.2.4.2.1 sets requirements for `<limits.h>`, including `SHRT_MIN <= -32767`, `SHRT_MAX >= +32767`, and `USHRT_MAX >= 65535` – Keith Thompson Apr 09 '13 at 20:42
  • 3
    Keep in mind that an implementation could `typedef` `int8_t` to a non-standard implementation defined type (and may well on those few platforms that use a 16-bit `char`). I think the C++11 standard is missing some necessary clarification about how these `stdint.h` types should resolve in overloads. I suspect that how these types might match for overload resolution would be implementation defined. – Michael Burr Apr 09 '13 at 20:43
  • @MichaelBurr: Yes, I could be missing something (I'm more familiar with C than with C++). C++ does permit *extended integer types*, something it borrowed from C99. Practically speaking, though, I can't think of a reason to define an extended integer type that could be a candidate for `int8_t` -- unless, I suppose, it's just to avoid making `int8_t` a character type. – Keith Thompson Apr 09 '13 at 20:47
  • @Keith: the one case I can think of is when the 'native' `char` type is 16-bits. Even in that case, if there were overloads for both `signed char` and `int`, which would be the preferred overload? – Michael Burr Apr 09 '13 at 20:57
  • 8
    @MichaelBurr: If `char` is 16 bits, then `CHAR_BIT==16`, and a byte is by definition 16 bits. Apart from bit fields, you can't have an integer type smaller than 1 byte. So in that case there would be no `int8_t`. (If you're not convinced, think about `sizeof (int8_t)`.) – Keith Thompson Apr 09 '13 at 21:06
  • @KeithThompson: Ah, yes. I forgot about that detail. – Michael Burr Apr 09 '13 at 21:09
  • @KeithThompson Would you be able to confirm with any authority that the standard has *not* taken measures to make my program's output well-defined? I don't want to presume that you're saying that. – Drew Dormann Apr 11 '13 at 01:26
  • @DrewDormann: Not with any authority, no. I've added several rambling paragraphs to my answer. – Keith Thompson Apr 11 '13 at 01:51
  • Even if `int8_t` is not a typedef for `signed char`, `ostream::operator<<(char)` is likely to be the best match during overload resolution, and so you'd get character output anyway. – Ben Voigt Apr 11 '13 at 01:57
  • @BenVoigt: What could it be a typedef for other than `signed char` or `char`? – Keith Thompson Apr 11 '13 at 01:58
  • @Keith: That `__int8` you mentioned in your answer (for example, MSVC has one). But the Standard `ostream` class doesn't have a formatted insertion operator overload for `__int8`... so it'll use the one for `char` anyway. – Ben Voigt Apr 11 '13 at 02:05
  • @BenVoigt: Ok. I don't know the "best match" rules very well, but this isn't the place to go into it. – Keith Thompson Apr 11 '13 at 02:11
  • 2
    @BenVoigt [over.ics.rank]/4: "Standard conversion sequences are ordered by their ranks: an Exact Match is a better conversion than a Promotion, which is a better conversion than a Conversion." In this case, a promotion would be [conv.prom]/1, i.e., a promotion to `(unsigned) int` (from a type with lower conversion rank). A conversion would be [conv.integral]/1, i.e. a conversion to any integer type (including `char`). Only if `char == uint8_t`, the most viable function should be `operator<< (char)` AFAIK, else `operator<< (int)`. – dyp Apr 14 '13 at 17:15
  • @BenVoigt See Daniel's answer. – dyp Apr 14 '13 at 21:49
  • @Ben Voigt: MSVC has, indeed, an intrinsic `__int8` type which is not a `typedef` for a `char`. However, "`__int8` data type is synonymous with type `char`". See [here](http://msdn.microsoft.com/en-GB/library/29dh1w7z.aspx). You can check this by declaring a variable of type `__int8` and see that the debugger says that it's a `char`. This is rather confusing, I must say. I have no idea why MS has done it. – Cassio Neri Apr 16 '13 at 00:36
  • 1
    @CassioNeri: `__int8` and friends presumably predate C99's `<stdint.h>`, which introduced `int8_t` et al. It sounds like, rather than being a typedef, `__int8` is an implementation-defined keyword that names the same type that `char` does, similar to the way `int` and `signed` name the same type. – Keith Thompson Apr 16 '13 at 00:49
  • 2
    Note that the function style cast and `static_cast` are not equivalent in the last example (the function-style cast may legally cast away constness or reinterpret) – Billy ONeal Apr 18 '13 at 18:33
  • The question of whether uint8_t is a character type is important in C because character types have unique aliasing behaviors. A lot of code requires a type which is guaranteed to be exactly 8 bits with no padding and shares `unsigned char`'s exemption from aliasing rules, and uses `uint8_t` for that purpose. It would have been helpful if c99 had defined a name like `uinta8_t` for an 8-bit type with aliasing support, [allowing uint8_t to be an extended type without aliasing support] but since it didn't, a lot of code uses `uint8_t` even when aliasing support is needed. – supercat Jan 09 '17 at 20:05
8

I'll answer your questions in reverse order.

Does the standard specify whether this type can or will be a character type?

Short answer: int8_t is signed char on the most popular platforms (GCC/Intel/Clang on Linux and Visual Studio on Windows) but might be something else on others.

The long answer follows.

Section 18.4.1 of the C++11 Standard provides the synopsis of <cstdint> which includes the following

typedef signed integer type int8_t; //optional

Later in the same section, paragraph 2, it says

The header [<cstdint>] defines all functions, types, and macros the same as 7.18 in the C standard.

where C standard means C99 as per 1.1/2:

C++ is a general purpose programming language based on the C programming language as described in ISO/IEC 9899:1999 Programming languages — C (hereinafter referred to as the C standard).

Hence, the definition of int8_t is to be found in Section 7.18 of the C99 standard. More precisely, C99's Section 7.18.1.1 says

The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes a signed integer type with a width of exactly 8 bits.

In addition, C99's Section 6.2.5/4 says

There are five standard signed integer types, designated as signed char, short int, int, long int, and long long int. (These and other types may be designated in several additional ways, as described in 6.7.2.) There may also be implementation-defined extended signed integer types. The standard and extended signed integer types are collectively called signed integer types.

Finally, C99's Section 5.2.4.2.1 imposes minimum sizes for standard signed integer types. Excluding signed char, all others are at least 16 bits long.

Therefore, int8_t is either signed char or an 8-bit extended (non-standard) signed integer type.

Both glibc (the GNU C library) and the Visual Studio C library define int8_t as signed char. Intel and Clang, at least on Linux, also use libc and hence, the same applies to them. Therefore, on the most popular platforms int8_t is signed char.

Given this C++11 program, should I expect to see a number or a letter? Or not make expectations?

Short answer: On the most popular platforms (GCC/Intel/Clang on Linux and Visual Studio on Windows) you will certainly see the letter 'A'. On other platforms you might see 65, though. (Thanks to DyP for pointing this out to me.)

In the sequel, all references are to the C++11 standard (current draft, N3485).

Section 27.4.1 provides the synopsis of <iostream>, in particular, it states the declaration of cout:

extern ostream cout;

Now, ostream is a typedef for a template specialization of basic_ostream as per Section 27.7.1:

template <class charT, class traits = char_traits<charT> >
class basic_ostream;

typedef basic_ostream<char> ostream;

Section 27.7.3.6.4 provides the following declaration:

template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>& out, signed char c);

If int8_t is signed char then it's this overload that's going to be called. The same section also specifies that the effect of this call is printing the character (not the number).

Now, let's consider the case where int8_t is an extended signed integer type. Obviously, the standard doesn't specify overloads of operator<<() for non-standard types, but thanks to promotions and conversions one of the provided overloads might accept the call. Indeed, int is at least 16 bits long and can represent all the values of int8_t. Then 4.5/1 gives that int8_t can be promoted to int. On the other hand, 4.7/1 and 4.7/2 give that int8_t can be converted to signed char. Finally, 13.3.3.1.1 yields that promotion is favored over conversion during overload resolution. Therefore, the following overload (declared in 27.7.3.1)

basic_ostream& basic_ostream::operator<<(int n);

will be called. This means that, this code

int8_t i = 65;
std::cout << i;

will print 65.
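
The preference for promotion over conversion can be observed with a type that is certainly not a character type. In this sketch, short stands in for a hypothetical extended 8-bit integer type, and `chosen_overload` is an illustrative stand-in for the two relevant operator<< overloads, not anything from the standard library.

```cpp
#include <string>

// Stand-ins for the two candidate overloads: one taking signed char
// (the character-printing one) and one taking int (the number-printing one).
inline std::string chosen_overload(signed char) { return "signed char"; }
inline std::string chosen_overload(int)         { return "int"; }

// short -> int is an integral promotion, short -> signed char is an integral
// conversion, so overload resolution picks the int overload.
inline std::string overload_for_short()
{
    return chosen_overload(static_cast<short>(65));
}

// An argument that already is a signed char is an exact match.
inline std::string overload_for_signed_char()
{
    return chosen_overload(static_cast<signed char>(65));
}
```

An extended 8-bit integer type would behave like short here: it is not an exact match for signed char, so the int overload wins.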

Update:

1. Corrected the post following DyP's comment.

2. Added the following comments on the possibility of int8_t being a typedef for char.

As said, the C99 standard (Section 6.2.5/4 quoted above) defines 5 standard signed integer types (char is not one of them) and allows implementations to add their own, which are referred to as extended (non-standard) signed integer types. The C++ standard reinforces that definition in Section 3.9.1/2:

There are five standard signed integer types: “signed char”, “short int”, “int”, “long int”, and “long long int” [...] There may also be implementation-defined extended signed integer types. The standard and extended signed integer types are collectively called signed integer types.

Later, in the same section, paragraph 7 says:

Types bool, char, char16_t, char32_t, wchar_t, and the signed and unsigned integer types are collectively called integral types. A synonym for integral type is integer type.

Therefore, char is an integer type but char is neither a signed integer type nor an unsigned integer type and Section 18.4.1 (quoted above) says that int8_t, when present, is a typedef for a signed integer type.

What might be confusing is that, depending on the implementation, char can take the same values as a signed char. In particular, char might have a sign but it's still not a signed char. This is explicitly said in Section 3.9.1/1:

[...] Plain char, signed char, and unsigned char are three distinct types. [...] In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.

This also implies that char is not a signed integer type as defined by 3.9.1/2.
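
These statements can be checked with type traits. A minimal sketch; the name `plain_char_is_signed` is only illustrative. Note that std::is_signed<char> merely reports whether char can hold negative values; it does not make char a signed integer type in the sense of 3.9.1/2.

```cpp
#include <climits>
#include <type_traits>

// Plain char, signed char, and unsigned char are three distinct types,
// even though char shares its range with one of the other two.
static_assert(!std::is_same<char, signed char>::value,
              "char and signed char are distinct types");
static_assert(!std::is_same<char, unsigned char>::value,
              "char and unsigned char are distinct types");

// char is nevertheless an integral (integer) type.
static_assert(std::is_integral<char>::value, "char is an integral type");

// Whether plain char takes on signed values is implementation-defined.
constexpr bool plain_char_is_signed = std::is_signed<char>::value;
static_assert(plain_char_is_signed == (CHAR_MIN < 0),
              "is_signed<char> agrees with CHAR_MIN");
```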

3. I admit that my interpretation and, specifically, the sentence "char is neither a signed integer type nor an unsigned integer type" is a bit controversial.

To strengthen my case, I would like to add that Stephan T. Lavavej said the very same thing here and Johannes Schaub - litb also used the same sentence in a comment on this post.

Cassio Neri
  • 2
    I don't think it'll fail to compile if `int8_t != signed char` for the following two reasons: 1) `int8_t` could be a `char` (a distinct type different from `signed char`). 2) Even if `int8_t` was an extended integer type, it would be an integer type, see [basic.fundamental]/2+7. And as [conv.prom]/1 tells us, it could be promoted either to `int` or `unsigned int` (as `int` must be >= `char` >= 8 bits). Also see Daniel's answer. – dyp Apr 16 '13 at 14:54
  • @DyP: You are right. Thanks to integral promotions/conversion there will be an overload of `operator<<` that can take the call. Thanks for pointing this out. I'll correct the post. However, as far as I understand, `int8_t` cannot be a `char`. I'll add more information on this point. Please, let me know what you think. – Cassio Neri Apr 17 '13 at 10:49
  • The state of `char` is not entirely clear to me. It's an _integral type_ but neither a _signed_ nor _unsigned integer type_. Could it be a typedef for an extended integer type? – dyp Apr 17 '13 at 12:40
  • I already worked this out with Daniel: [over.ics.rank] says that an integral Promotion [conv.prom] will be preferred over an integral Conversion [conv.integral] when computing the best viable function (overload). And an integral promotion of `int8_t` to `int` is certainly possible (`int` >= 16 bit); same for `uint8_t` and `unsigned int`. Therefore, if it has to be converted, it'll be promoted to an `int` and the output will be `65` (or whatever number) rather than `A`. Plus I'm still not sure whether `typedef extended_int char; typedef extended_int int8_t;` is legal or not. – dyp Apr 17 '13 at 17:33
  • @DyP: `typedef extended_int char` is illegal because `char` is a standard type and standard and extended types must be different. `typedef extended_int int8_t` is legal. I believe, but I'm not sure (as explained in my post), that `uint8_t` can be promoted to `signed char` and, hence, 'A' will be printed (not `65`). – Cassio Neri Apr 22 '13 at 17:29
  • 1
    "standard and extended types must be different" Could you please provide a reference? I'd appreciate that. `uint8_t` cannot be _promoted_ to `signed char`, it can only be promoted to either `int` or `unsigned int` 4.5[conv.prom]/1; but as C specifies `int` is >= 16 bit, it can only be promoted to `int`. It can be _converted_ to `signed char`, though, but promotion will be preferred during overload resolution [over.ics.rank]. – dyp Apr 22 '13 at 18:21
  • @DyP You're right: if `uint8_t` isn't `signed char`, then it can be promoted to `int`, converted to `signed char` and promotion is preferable during overload resolution. Hence, `65` will be printed. I'll update soon. Regarding standard and extended types being different: it's implied by [conv.rank]/1. More specifically, it says, "The rank of any standard integer type shall be greater than the rank of any extended integer type with the same size." If a standard integer type was a typedef for an extended integer type, then they would have the same size and rank. Thanks for the nice discussion. – Cassio Neri Apr 22 '13 at 22:42
  • Thank you, too, for the discussion and the reference. I also thought about the ranking, but I didn't find any explicit statement like in [basic.fundamental]. It _is_ a rather far-fetched case, and I tend to agree the `typedef extended_type char;` contradicts the Std. – dyp Apr 22 '13 at 22:51
5

The working draft copy I have, N3376, specifies in [cstdint.syn] § 18.4.1 that the int types are typically typedefs.

namespace std {
typedef signed integer type int8_t; // optional
typedef signed integer type int16_t; // optional
typedef signed integer type int32_t; // optional
typedef signed integer type int64_t; // optional
typedef signed integer type int_fast8_t;
typedef signed integer type int_fast16_t;
typedef signed integer type int_fast32_t;
typedef signed integer type int_fast64_t;
typedef signed integer type int_least8_t;
typedef signed integer type int_least16_t;
typedef signed integer type int_least32_t;
typedef signed integer type int_least64_t;
typedef signed integer type intmax_t;
typedef signed integer type intptr_t; // optional
typedef unsigned integer type uint8_t; // optional
typedef unsigned integer type uint16_t; // optional
typedef unsigned integer type uint32_t; // optional
typedef unsigned integer type uint64_t; // optional
typedef unsigned integer type uint_fast8_t;
typedef unsigned integer type uint_fast16_t;
typedef unsigned integer type uint_fast32_t;
typedef unsigned integer type uint_fast64_t;
typedef unsigned integer type uint_least8_t;
typedef unsigned integer type uint_least16_t;
typedef unsigned integer type uint_least32_t;
typedef unsigned integer type uint_least64_t;
typedef unsigned integer type uintmax_t;
typedef unsigned integer type uintptr_t; // optional
} // namespace std

Since the only requirement made is that it must be 8 bits, a typedef to a char is acceptable.

Rapptz
-2

char, signed char, and unsigned char are three different types, and a char is not always 8 bits. On most platforms they are all 8-bit integers, but std::ostream only defines character versions of << for them, with behavior like printf("%c", ...).

richselian
  • 1
    They are exactly 8 bits on every platform that defines `int8_t`. – Ben Voigt Apr 14 '13 at 21:51
  • @BenVoigt Not exactly, `CHAR_BIT` in `<climits>` defines how many bits are in a `char`. Though I haven't seen any platform with CHAR_BIT value other than 8. – richselian Apr 15 '13 at 05:31
  • 1
    if `CHAR_BIT` is greater than 8, then `int8_t` does not exist on the platform. The Standard does not allow `CHAR_BIT` to be less than 8. – Ben Voigt Apr 15 '13 at 14:19