2

Is the following code undefined behavior according to GCC in C99 mode:

signed char c = CHAR_MAX; // assume CHAR_MAX < INT_MAX
c = c + 1;
printf("%d", c);
Natan Yellin
  • @DevSolar This isn't a duplicate. This question is specifically about the behavior of chars. You can see from the answers that there is an actual question here regardless of the difference between signed and unsigned overflow. – Natan Yellin Nov 13 '17 at 12:06
  • @DevSolar Value of `CHAR_MAX` is irrelevant here. This code won't have overflow because `char` is converted to `int`, and it's always at least 16 bits in size. Behaviour should be evaluated by type conversion rules, not overflow rules. – user694733 Nov 13 '17 at 13:23
  • That's not a `signed char` - that's a `char` (and it may be a signed or unsigned type, but it's distinct from both `signed char` and `unsigned char`). – Toby Speight Nov 13 '17 at 13:52
  • Because of usual arithmetic promotions, it's only undefined if `char` is both `signed` and at least as large as `int` (highly unlikely). I don't think you can get C to perform that increment with unpromoted `char`s. – Petr Skocik Nov 13 '17 at 14:03
  • @NatanYellin The post has a loose usage of `signed char` in the title and then codes with `char`. Using `signed char` for both or `char` for both would have added clarity to the question as C treats `char` and `signed char` as distinct types. – chux - Reinstate Monica Nov 13 '17 at 16:34
  • @NatanYellin With the title asking about overflow and code using +1, coding `char c = CHAR_MAX;` would make more sense, as 127 is not necessarily the max value of a `char`. 127 is the minimum value `CHAR_MAX` could have. – chux - Reinstate Monica Nov 13 '17 at 16:38
  • I've updated the post according to comments. – Natan Yellin Nov 15 '17 at 14:58

2 Answers

8

signed char overflow does cause undefined behavior, but that is not what happens in the posted code.

With c = c + 1, the integer promotions are performed before the addition, so c is promoted to int in the expression on the right. Since 128 is less than INT_MAX, this addition occurs without incident. Note that char is typically narrower than int, but on rare systems char and int may be the same width. In either case a signed char is promoted to int in arithmetic expressions; only an unsigned plain char that is as wide as int would be promoted to unsigned int instead.

When the assignment to c is then made, if plain char is unsigned on the system in question, the result of the addition is less than UCHAR_MAX (which must be at least 255) and this value remains unchanged in the conversion and assignment to c.

If instead plain char is signed, the result of the addition is converted to a signed char value before assignment. Here, if the result of the addition can't be represented in a signed char the conversion "is implementation-defined, or an implementation-defined signal is raised," according to §6.3.1.3/3 of the Standard. SCHAR_MAX must be at least 127, and if this is the case then the behavior is implementation-defined for the values in the posted code when plain char is signed.

The behavior is not undefined for the code in question, but is implementation-defined.

ad absurdum
  • Is there any way to avoid integer promotions and actually cause a signed char overflow? – Natan Yellin Nov 14 '17 at 22:39
  • @NatanYellin -- Good question! There is no way to avoid the integer promotions. Since all of the operators that might be used to increase the value of a `signed char` first perform the integer promotions on their operands (e.g. `+`, `*`, `+=`, `++`), it seems that overflow in a calculation involving `signed char` would be possible, but the overflow would be in an `int` expression after the promotions: not actually `signed char` overflow. – ad absurdum Nov 15 '17 at 10:49
  • In other words, signed char overflow is well-defined in the range [-255,255]? – Natan Yellin Nov 15 '17 at 14:51
  • @NatanYellin -- Well, the range of a `signed char` is at least [-127, 127]. `signed char` overflow is not well-defined, but undefined behavior since all signed integer overflow is undefined behavior. Yet there is no way to achieve `signed char` overflow (that I can think of) since any arithmetic expression involving `signed char` values would first promote the values to `int`. Signed integer overflow would then be possible with the `int` values, with undefined behavior, but it would not be `signed char` overflow. – ad absurdum Nov 15 '17 at 14:57
  • @NatanYellin -- Increasing a `signed char` value outside of the range of a `signed char`, as in the posted code, simply results in an `int` value, so long as the new value is in the range of an `int`. The assignment back to `signed char` is well-defined, but also implementation-dependent. – ad absurdum Nov 15 '17 at 15:00
4

No, it has implementation-defined behavior, either storing an implementation-defined result or possibly raising a signal.

Firstly, the usual arithmetic conversions are applied to the operands. This converts the operands to type int and so the computation is performed in type int. The result value 128 is guaranteed to be representable in int, since INT_MAX is guaranteed to be at least 32767 (5.2.4.2.1 Sizes of integer types), so next a value 128 in type int must be converted to type char to be stored in c. If char is unsigned, CHAR_MAX is guaranteed to be at least 255; otherwise, if SCHAR_MAX takes its minimal value of 127:

6.3.1.3 Signed and unsigned integers

When a value with integer type is converted to another integer type, [if] the new type is signed and the value cannot be represented in it[,] either the result is implementation-defined or an implementation-defined signal is raised.

In particular, gcc can be configured to treat char as either signed or unsigned (-f[un]signed-char); by default it will pick the appropriate configuration for the target platform ABI, if any. If a signed char is selected, all current gcc target platforms that I am aware of have an 8-bit byte (some obsolete targets such as AT&T DSP1600 had a 16-bit byte), so it will have range [-128, 127] (8-bit, two's complement) and gcc will apply modulo arithmetic yielding -128 as the result:

The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

ecatmur