Why do implementations of "stdint.h" disagree on the definition of UINT8_C?

Question

The UINT8_C macro is defined in "stdint.h", with the following specification: The macro UINTN_C(value) shall expand to an integer constant expression corresponding to the type uint_leastN_t.

In the wild, however, implementations differ:

#define UINT8_C(value) ((uint8_t) __CONCAT(value, U))  // AVR-libc
#define UINT8_C(x_)    (static_cast<std::uint8_t>(x_)) // QP/C++
#define UINT8_C(c)     c                               // GNU C Library

The first two implementations seem roughly equivalent, but the third one behaves differently: for example, the following program prints 1 with AVR-libc and QP/C++, but -1 with glibc (because right shifts on signed values propagate the sign bit).

std::cout << (UINT8_C(-1) >> 7) << std::endl; // prints -1 in glibc
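
To make the difference reproducible without swapping C libraries, here is a minimal sketch. `CAST_STYLE_UINT8_C` and `PLAIN_STYLE_UINT8_C` are hypothetical stand-ins written for this illustration (modelled on the AVR-libc and glibc lines quoted above); they are not the real library macros.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the two styles quoted above; not the library macros.
#define CAST_STYLE_UINT8_C(value) ((std::uint8_t) value##U)  // cast plus U suffix
#define PLAIN_STYLE_UINT8_C(c) c                             // argument passed through

// ((std::uint8_t) -1U) is 255; promoted to int and shifted right by 7 gives 1.
int cast_style_shift() { return CAST_STYLE_UINT8_C(-1) >> 7; }

// -1 >> 7 on a plain int: -1 wherever right shift is arithmetic
// (implementation-defined before C++20, guaranteed from C++20 on).
int plain_style_shift() { return PLAIN_STYLE_UINT8_C(-1) >> 7; }
```

Calling both functions on a typical GCC or Clang target should reproduce the 1 versus -1 split described above.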

The implementation of UINT16_C displays the same behavior, but not UINT32_C, because its definition includes the U suffix:

#define UINT32_C(c) c ## U

Interestingly, glibc's definition of UINT8_C changed in 2006, due to a bug report. The previous definition was #define UINT8_C(c) c ## U, but that produced incorrect output (false) on -1 < UINT8_C(0) due to integer promotion rules.
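
The comparison from that bug report can be sketched as follows. `OLD_STYLE_UINT8_C` and `NEW_STYLE_UINT8_C` are hypothetical names for the pre-2006 and current glibc definitions quoted in this question; the sketch assumes nothing else about glibc.

```cpp
// Hypothetical names for the two glibc definitions quoted above.
#define OLD_STYLE_UINT8_C(c) c##U  // pre-2006: appends the U suffix
#define NEW_STYLE_UINT8_C(c) c     // current: passes the constant through

// -1 < 0U: the usual arithmetic conversions turn -1 into UINT_MAX, so this is false.
bool old_style_less() { return -1 < OLD_STYLE_UINT8_C(0); }

// -1 < 0: both operands are int, so this is true.
bool new_style_less() { return -1 < NEW_STYLE_UINT8_C(0); }
```

A compiler may warn about the signed/unsigned comparison in `old_style_less`; that warning is exactly the conversion at issue.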

Are all three definitions correct according to the standard? Are there other differences (besides the handling of negative constants) between these three implementations?

In C++ you want to `#include <cstdint>`, not `stdint.h` - just saying. – Jesper Juhl, Aug 08 '19 at 16:40
Implementations can't be controlled by the standard; the implementation can do what it likes, but is supposed to be guided by the standard. Implementations actually do their own thing. It is not hard to argue that the GNU C Library version isn't very accurate, but in a C (as opposed to C++) library, it is hard to know when it will make a difference (since almost any use of the macro will be expanded to `int` before anything further happens). — Jonathan Leffler, Aug 08 '19 at 16:40
Your example isn't very compelling. Right-shifting a negative value has an implementation-defined result until C++20. — chris, Aug 08 '19 at 16:44
@chris That's the point: Casting -1 to an unsigned type should theoretically yield the maximum representable value of that type (and subsequently behave just fine when right-shifted). — Max Langhof, Aug 08 '19 at 16:48
Since requirement is "expand to an integer constant expression corresponding to the type uint_leastN_t", I'd expect a cast of whatever is within as in `((uint_least8_t) (c))` — chux - Reinstate Monica, Aug 08 '19 at 16:48
@MaxLanghof, My bad, I glossed over the sign mismatch of the value and `UINT8_C`. — chris, Aug 08 '19 at 16:56
@chux Any implementation that included a cast in the expansion would be non-conforming. — Ian Abbott, Aug 08 '19 at 21:45
@IanAbbott Why? This is fine: `((uint_least8_t) + (c))` (this was pointed out by Florian Weimer on the bug thread) — Clément, Aug 08 '19 at 22:30
Because casts cannot be used in `#if` preprocessing directives. Also re-read the comment in the bug report. You have it negated. — Ian Abbott, Aug 08 '19 at 22:36
Or you have it at least half negated. The comment by Florian Weimer in the bug report looks a bit dodgy. — Ian Abbott, Aug 08 '19 at 22:43
As a note, Visual C++ (at least in the 2010 and 2017 versions, I don't have any other installs on hand to check) uses `#define UINT8_C(x) (x)`, similarly to the GCC implementation. — Justin Time - Reinstate Monica, Aug 09 '19 at 01:21
@IanAbbott 1) Without a cast, how would code conform to "expand to an integer constant expression corresponding to the type uint_leastN_t"? 2) The libraries certainly can use a cast or other language extensions; we cannot. Little prevents a compiler/library pair from ignoring casts in `#` processing, if it was made to do so. It is just that code we write should not employ a cast. – chux - Reinstate Monica, Aug 12 '19 at 15:12
@chux 1) Because it is supposed to expand to an integer constant of the corresponding type *converted according to the integer promotions*. (This implies `sizeof(UINT8_C(0)) == sizeof(int)`.) 2) Sure, the compiler's standard libraries could use some hidden compiler built-in magic ignored by the preprocessor `#if` directive to make things work, but it shouldn't need to. – Ian Abbott, Aug 12 '19 at 15:37
@IanAbbott I tend to agree with your #1 statement, yet the wording of the spec seems too easy to read otherwise. — chux - Reinstate Monica, Aug 12 '19 at 15:43
@chux Although the parameter of `#if` is just a *constant-expression* (which can also include the `defined` operator), I think C11 note 166 (referenced by 6.10.1p1) comes into play: "Because the controlling constant expression is evaluated during translation phase 4, all identifiers either are or are not macro names -- there simply are no keywords, enumeration constants, etc." So at translation phase 4 it simply would not know what to do with a cast, and certainly wouldn't know anything about `typedef`. — Ian Abbott, Aug 12 '19 at 16:04

Ian Abbott · Accepted Answer · 2019-08-08T19:04:40.653

If an int can represent all the values of a uint_least8_t then the GNU implementation of the UINT8_C(value) macro as #define UINT8_C(c) c conforms to the C standard.

As per C11 7.20.4 Macros for integer constants paragraph 2:

The argument in any instance of these macros shall be an unsuffixed integer constant (as defined in 6.4.4.1) with a value that does not exceed the limits for the corresponding type.

For example, if UINT_LEAST8_MAX is 255, the following usage examples are legal:

UINT8_C(0)
UINT8_C(255)
UINT8_C(0377)
UINT8_C(0xff)

But the following usage examples result in undefined behavior:

UINT8_C(-1) — not an integer constant as defined in 6.4.4.1
UINT8_C(1u) — not an unsuffixed integer constant
UINT8_C(256) — exceeds the limits of uint_least8_t for this implementation

The signed equivalent INT8_C(-1) is also undefined behavior for the same reasons.
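
As a rough check of the legal forms listed above, the following sketch assumes an implementation where `UINT_LEAST8_MAX` is 255 and where `<cstdint>` provides `UINT8_C` (as it does from C++11 on). The function name `legal_sum` is made up for this example.

```cpp
#include <cstdint>

// Decimal, octal and hexadecimal spellings are all unsuffixed integer
// constants, so each is a legal argument and each has the value 255.
static_assert(UINT8_C(255) == 255, "decimal form");
static_assert(UINT8_C(0377) == 255, "octal form");
static_assert(UINT8_C(0xff) == 255, "hexadecimal form");

// Hypothetical helper: the expansions are ordinary integer constant expressions.
constexpr int legal_sum() { return UINT8_C(0) + UINT8_C(255); }
```

The illegal forms are deliberately left out: because they are undefined behavior, whatever a particular compiler does with them proves nothing.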

If UINT_LEAST8_MAX is 255, a legal instance of UINT8_C(value) will expand to an integer constant expression and its type will be int due to integer promotions, as per paragraph 3:

Each invocation of one of these macros shall expand to an integer constant expression suitable for use in #if preprocessing directives. The type of the expression shall have the same type as would an expression of the corresponding type converted according to the integer promotions. The value of the expression shall be that of the argument.

Thus for any legal invocation of UINT8_C(value), the expansion of this to value by any implementation where an int can represent all the values of uint_least8_t is perfectly standard conforming. For any illegal invocation of UINT8_C(value) you may not get the result you were expecting due to undefined behavior.
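
The type requirement in paragraph 3 can be probed from C++ with `decltype`. This is only a sketch: the alias `promoted_u8` and both function names are invented here, and the second check is expected to hold only on implementations whose `UINT8_C` follows paragraph 3 (a cast-style definition would make it false).

```cpp
#include <cstdint>
#include <type_traits>

// Type of a uint_least8_t operand after the integer promotions (unary + promotes).
using promoted_u8 = decltype(+std::uint_least8_t{0});

// True wherever int can represent every value of uint_least8_t.
constexpr bool promoted_is_int() {
    return std::is_same<promoted_u8, int>::value;
}

// True when the macro's expansion has exactly the promoted type.
constexpr bool macro_matches_promoted() {
    return std::is_same<decltype(UINT8_C(0)), promoted_u8>::value;
}
```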

[EDIT added for completeness] As pointed out in cpplearner's answer, the other implementations of UINT8_C(value) shown in OP's question are invalid because they expand to expressions that are not suitable for use in #if preprocessing directives.

cpplearner · Answer 2 · score 7 · 2019-08-08T19:25:15.837

The first two implementations are not conforming to the C standard, because they don't permit UINT8_C(42) in #if directives:

#if UINT8_C(42) == 42 // <- should be a valid expression

N1570 7.20.4/3:

Each invocation of one of these macros shall expand to an integer constant expression suitable for use in #if preprocessing directives. The type of the expression shall have the same type as would an expression of the corresponding type converted according to the integer promotions. The value of the expression shall be that of the argument.
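
A small sketch of that requirement in use, assuming a `<cstdint>` whose `UINT8_C` expands to something the preprocessor can evaluate (the glibc-style definition does). The names `kUsableInIf` and `usable_in_if` are made up for the example. With a cast-style definition, the `#if` line itself would be rejected instead of selecting a branch.

```cpp
#include <cstdint>

// The preprocessor must be able to evaluate the macro's expansion here.
#if UINT8_C(42) == 42
constexpr bool kUsableInIf = true;
#else
constexpr bool kUsableInIf = false;
#endif

// Hypothetical accessor so the result can be inspected at run time.
constexpr bool usable_in_if() { return kUsableInIf; }
```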

edited Aug 08 '19 at 19:25

answered Aug 08 '19 at 18:18

cpplearner

In addition, the second one doesn't conform to the C standard because it expands to C++ code. – Shawn Aug 08 '19 at 19:31
IMHO, `#define UINT8_C(value) ((uint8_t) __CONCAT(value, U))` is not shown to not conform. "they don't permit UINT8_C(42) in #if directives" is not demonstrated. A _library_ `<*.h>` file may code things that are not generally portable. It only needs to be compilable by the given compiler/preprocessor. `#if UINT8_C(42) == 42` may, or may not, compile with the corresponding preprocessor. Source code in `<*.h>` files is not a good example for us to generally follow - they cheat. – chux - Reinstate Monica Aug 12 '19 at 16:03
@chux [6.10.1/4](http://port70.net/~nsz/c/c11/n1570.html#6.10.1p4) specifies that `#if` replaces every identifier with `0` after macro expansion, so `((uint8_t) __CONCAT(42, U))` is treated as `((0) 42U)` which is a syntax error and requires a diagnostic. I don't think there's room for any conforming extension. At least AVR-GCC does not have this kind of extension AFAIK. – cpplearner Aug 12 '19 at 17:02
@chux However, AVR-libc uses the `__UINT8_C` builtin when it is available, and _that_ may be a conforming implementation. – cpplearner Aug 12 '19 at 17:07
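
The 6.10.1/4 rule cited in this thread (identifiers left over after macro expansion are replaced with `0` inside `#if`) can be observed directly. In this sketch `hypothetical_typedef_name` is deliberately never declared or defined; it stands in for a name like `uint8_t`, which the preprocessor likewise knows nothing about.

```cpp
// hypothetical_typedef_name is intentionally not a macro: inside #if it is
// replaced by 0, just as uint8_t would be in a cast-style expansion.
#if hypothetical_typedef_name == 0
constexpr bool kUnknownIdentifierIsZero = true;
#else
constexpr bool kUnknownIdentifierIsZero = false;
#endif

// Hypothetical accessor so the result can be inspected at run time.
constexpr bool unknown_identifier_is_zero() { return kUnknownIdentifierIsZero; }
```

This is why `((uint8_t) 42U)` turns into `((0) 42U)` during `#if` evaluation and fails to parse.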

score 3 · Answer 3 · answered Aug 08 '19 at 16:48

The GNU C library is not correct. Per C11 7.20.4.1 Macros for minimum-width integer constants, UINTN_C(value) is defined as:

The macro UINTN_C(value) shall expand to an integer constant expression corresponding to the type uint_leastN_t.

So it is not proper that they just use c, since c may or may not be a uint_least8_t.

answered Aug 08 '19 at 16:48

NathanOliver

[Agree](https://stackoverflow.com/questions/57417154/why-do-implementations-of-stdint-h-disagree-on-the-definition-of-uint8-c#comment101314382_57417154). Affects `_Generic` usage. – chux - Reinstate Monica Aug 08 '19 at 16:50
At least for the C language (I've no idea about C++), if the `uint_least8_t` value would be promoted to an `int` anyway, then `#define UINT8_C(c) c` is fair enough. The unsuffixed integer constant `c` is required to be within the range of `uint_least8_t` (C11 7.20.4p2), so using an out-of-range argument is UB. – Ian Abbott Aug 08 '19 at 17:30
@IanAbbott You are correct that the value is promoted to an `int` for use with the `operator >>`. The issue though is `UINT8_C(-1)` isn't "returning" the correct value. The expression should be getting at least `255` as the value to be shifted. Using `#define UINT8_C(c) c ` it will get `-1` which is not correct. – NathanOliver Aug 08 '19 at 17:37
But `-1` is not representable by `uint_least8_t` so it is UB. – Ian Abbott Aug 08 '19 at 17:38
@IanAbbott That is not correct. Any value is representable in a `uint_least8_t` as it uses modulo 2^8 arithmetic. Doing `(uint_least8_t)-1` gives you the max value. – NathanOliver Aug 08 '19 at 17:39
C11 7.20.4p2: "The argument in any instance of these macros shall be an unsuffixed integer constant (as defined in 6.4.4.1) **with a value that does not exceed the limits for the corresponding type**." (emphasis mine) – Ian Abbott Aug 08 '19 at 17:41
@IanAbbott You would be tempted to think that `-1` is out of range, but it isn't. [There is no such thing as a negative integer literal](https://stackoverflow.com/questions/45469214/why-does-the-most-negative-int-value-cause-an-error-about-ambiguous-function-ove/45469321#45469321) so the actual value is `1`, we should get that as a `uint_least8_t`, and then we negate it with the unary `-` to make it its max value. – NathanOliver Aug 08 '19 at 17:49
Then it is still UB because `-1` is not an integer constant as required by 7.20.4p2. – Ian Abbott Aug 08 '19 at 17:53
@IanAbbott I would recommend posting this as a separate answer; I think it's rather convincing. – Clément Aug 08 '19 at 18:03
@IanAbbott I'm not sure if it is going to be UB or not. All that aside, doing `std::cout << -1 * UINT8_C(1);` on a conforming implementation should print the max value of `uint_least8_t`. In GNU C it will print `-1`, so it is not conforming. – NathanOliver Aug 08 '19 at 18:09
Since the C++ standard seems to defer to the C standard on this matter, I fail to see why `-1 * UINT8_C(1)` would not be an `int` with value `-1` as long as `int` can represent all the values of `uint_least8_t`. – Ian Abbott Aug 08 '19 at 18:49
@IanAbbott Dang, forgot about the conversion again. All versions would give `-1`. I'm still going to keep this answer because at least in C++ we can check the type via templates or overloads and the type it should evaluate to is `uint_least8_t`. – NathanOliver Aug 08 '19 at 18:54
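
The overload-based check mentioned at the end of this thread might look like the sketch below. The function names are invented for the example, and the results described in the comments assume a pass-through (glibc-style) `UINT8_C`; a cast-style definition would select the other overload, while the arithmetic result would be the same either way because of integer promotion.

```cpp
#include <cstdint>

// Two overloads that report which type the argument expression has.
constexpr bool is_int(int) { return true; }
constexpr bool is_int(std::uint_least8_t) { return false; }

// With a pass-through UINT8_C the expansion is the int constant 1.
constexpr bool macro_selects_int() { return is_int(UINT8_C(1)); }

// The operand is int (or promoted to int) before the multiplication,
// so the product is the int value -1 under either style of definition.
constexpr int negated() { return -1 * UINT8_C(1); }
```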
