5

I made a statement to a colleague of mine, which was:

"chars are automatically promoted to integers in C expressions, and that's fine for performance since CPUs work fastest with their natural word size."

I believe char promotion behavior is stated somewhere in the standard due to a char's rank.

This is the response I got back:

"Characters are not default promoted to an integer. The register size is 32 bit, but multiple byte values in a row can be packed into a single register as a compiler implementation. This is not always predictive. The only time you can verify automatic promotion is when the type is passed into the call stack when not wrapped around a structure because C standard officially needs 32-bit values in the call stack memory. A great deal of CPU architectures have optimized assembly calls for non-32 bit values, so no assumptions can be made about the CPU or compiler in this case."

I'm not sure who is right, and what to believe. What are the facts?

Trevor Hickey
  • 5
    *because C standard officially needs 32-bit values in the call stack memory* who on Earth said that?! – Filipe Gonçalves Sep 03 '15 at 19:05
  • Did you mean specifically arithmetic operations? If so then [Why must a short be converted to an int before arithmetic operations in C and C++?](http://stackoverflow.com/q/24371868/1708801) is relevant. – Shafik Yaghmour Sep 03 '15 at 19:06
  • 4
    This statement "C standard officially needs 32-bit values in the call stack memory" is an utter BS. C standard has no mention of number of bits. It doesn't even talk about stack. – Eugene Sh. Sep 03 '15 at 19:07
  • @ShafikYaghmour well not necessarily. What about something like: char c; if (c){}? I guess what I'm asking is, if the expression evaluates to a char and is then used/evaluated, does it get promoted to an int? – Trevor Hickey Sep 03 '15 at 19:07
  • C11 draft standard, `6.3.1.8 Usual arithmetic conversions Section 1 [...]Otherwise, the integer promotions are performed on both operands.[...]`. + `6.3.1.1 Boolean, characters, and integers Section 2 [...]If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.[...]` – EOF Sep 03 '15 at 19:07
  • 3
    Your colleague seems to be using the word "registry" when he seems to mean "register". The registry is a feature of the Windows OS, not CPUs. – Barmar Sep 03 '15 at 19:09
  • 2
    @EugeneSh. In fact, the word "stack" does not appear in the C11 standard at all. – NathanOliver Sep 03 '15 at 19:09
  • 1
    char is default-promoted to int or unsigned int. – Michi Sep 03 '15 at 19:09
  • @EugeneSh. Right. The only thing I'm unsure about, and could use some clarification on, is when a char DOES get default promoted. – Trevor Hickey Sep 03 '15 at 19:10
  • 1
    The colleague's language is clearly poor. He's apparently using "the stack" to mean "when passed as a function argument" – Barmar Sep 03 '15 at 19:10
  • `char`s are only automatically promoted to `int`s for arithmetic operations. – Red Alert Sep 03 '15 at 19:14
  • @RedAlert That's clear to me now. Is that the ONLY time a char is automatically promoted? Arithmetic operations alone? – Trevor Hickey Sep 03 '15 at 19:15
  • @TrevorHickey by `char` are you referring to `char` variables or character literals? – Eugene Sh. Sep 03 '15 at 19:17
  • @EugeneSh. char variables – Trevor Hickey Sep 03 '15 at 19:18
  • @TrevorHickey OK then, since, as you might know, character literals are in fact `int` :) – Eugene Sh. Sep 03 '15 at 19:19
  • 2
    @RedAlert: `char`s will also be converted when passed to a function without a prototype, or a variadic function, or a function with a prototype other than `char`: C11 draft standard, `6.5.2.2 Function calls`. – EOF Sep 03 '15 at 19:19
  • 1
    A large part of C's lifetime was on 16-bit machines. When C was first standardized, 16-bit machines were still the norm. So to say that the C standard requires 32-bit registers is bogus. – PC Luddite Sep 03 '15 at 19:20
  • 2
    `char` is promoted to `int`. However, if the compiler can prove it yields identical results for the abstract machine, it can very well perform an operation with smaller sizes, e.g. `char`. For instance: `char a = 3, c = 1; c += a;` can be evaluated using a single 8-bit addition on an 8-bit CPU. But that is irrelevant for the standard. Perhaps that's what your colleague was thinking of. But as quoted, the statement is very poor, as the standard does not even mandate a specific bit size. – too honest for this site Sep 03 '15 at 19:21
  • `char c = 'a'; printf("%d", c);` – Michi Sep 03 '15 at 19:22
  • I think both statements are right. The first is about behavior, the second about implementation. I think by "C standard", the speaker meant the C ABI standard for the particular platform whose implementation he was talking about. – David Schwartz Sep 03 '15 at 19:52
  • @DavidSchwartz Hmm, so you're saying that the standard enforces char promotion, but the ABI standard may not abide to it?-- And that's ok? – Trevor Hickey Sep 03 '15 at 19:56
  • 2
    @TrevorHickey No, the reverse. The ABI may require char promotion where the standard doesn't require it (under the as-if rule). And that's okay. The standard never requires actual char promotion, just that you get the same result as if you promoted. The implementation can avoid actually doing promotion any time it can produce the same result without promoting (and as he explained, that can make sense). However for function calls, the ABI may mandate the actual promotion even where it could otherwise be avoided. – David Schwartz Sep 03 '15 at 20:05
  • @DavidSchwartz That makes sense. – Trevor Hickey Sep 03 '15 at 20:18

4 Answers

8

chars are automatically promoted to integers in C expressions

Yes, they are. C99 section 6.3.1.8, Usual arithmetic conversions:

Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is the type domain of the operands if they are the same, and complex otherwise. This pattern is called the usual arithmetic conversions:

  • First, if the corresponding real type of either operand is long double, the other operand is converted, without change of type domain, to a type whose corresponding real type is long double.
  • Otherwise, if the corresponding real type of either operand is double, the other operand is converted, without change of type domain, to a type whose corresponding real type is double.
  • Otherwise, if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float.
  • Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
    • If both operands have the same type, then no further conversion is needed.
    • Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
    • Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
    • Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
    • Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

Integer promotions are described in Section 6.3.1.1, paragraph 2:

The following may be used in an expression wherever an int or unsigned int may be used:

  • An object or expression with an integer type whose integer conversion rank is less than or equal to the rank of int and unsigned int.
  • A bit-field of type _Bool, int, signed int, or unsigned int

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

The rank of a char is less than or equal to that of an int, so char is included in here.
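
To make the rule above concrete, here is a minimal sketch that asks the compiler for the type of a few expressions using C11 `_Generic`. The macros `IS_INT` and `IS_CHAR` and the function names are hypothetical helpers introduced only for this illustration. It assumes a typical platform where `int` can represent every `char` value, so the promoted type is `int` rather than `unsigned int`.

```c
/* Hypothetical helper macros (not part of any library): each evaluates to 1
   when the expression has the named type, and 0 otherwise. Requires C11. */
#define IS_INT(x)  _Generic((x), int: 1, default: 0)
#define IS_CHAR(x) _Generic((x), char: 1, default: 0)

/* A char object used on its own keeps the type char. */
int plain_char_is_char(void)
{
    char c = 'a';
    return IS_CHAR(c);
}

/* Both operands of + undergo the integer promotions, so the sum is an int
   (assuming int can represent all char values on this platform). */
int char_sum_is_int(void)
{
    char c = 'a';
    return IS_INT(c + c);
}

/* In C (unlike C++), a character constant such as 'a' already has type int. */
int char_constant_is_int(void)
{
    return IS_INT('a');
}
```

Each function is expected to return 1 under the stated assumption; on a platform where `int` cannot hold every `char` value, `c + c` would be `unsigned int` instead.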

(As a footnote, it is mentioned that integer promotions are only applied as part of the usual arithmetic conversions, to certain argument expressions, to the operands of the unary +, - and ~, and to both operands of the shift operators).

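As mentioned in the comments, integer promotion is also performed on function-call arguments when the called function has no prototype or when the argument corresponds to the trailing `...` of a variadic function (the default argument promotions, Section 6.5.2.2).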
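
As an illustration of promotion on function-call arguments, the sketch below uses a variadic function. The name `sum_promoted` is a hypothetical helper, not a library function. `char` values passed through the `...` arrive as `int`, which is why `va_arg` has to read them back as `int`.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: adds up `count` trailing arguments.
   Callers may pass char values; the default argument promotions
   convert them to int before the call. */
int sum_promoted(int count, ...)
{
    va_list ap;
    int total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* Reading with va_arg(ap, char) would be undefined behavior,
           because the argument was promoted to int by the caller. */
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

For example, `sum_promoted(2, (char)1, (char)2)` passes two `char` values and reads them back as `int`.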

Filipe Gonçalves
  • 1
    Integer promotion is also performed on function-call arguments. – EOF Sep 03 '15 at 19:24
  • Upvoted, but I think you should include a bit more of the preceding text in your quote from 6.3.1.1 for clarity: "The following may be used in an expression wherever an int or unsigned int may be used: — An object or expression with an integer type whose integer conversion rank is less than or equal to the rank of int and unsigned int." (which includes char from the preceding definition) – mattnewport Sep 03 '15 at 19:29
  • 1
    @mattnewport Yes, I agree it's important to provide some context. It looked like it came out of nowhere without the introductory text. Added it to my answer, thanks for suggesting. – Filipe Gonçalves Sep 03 '15 at 19:32
4

Yes. Expressions that combine several chars, such as additions (but not things like the comma operator), and a few other constructs are evaluated on promoted values (promoted to int). See e.g. N3797 (a C++ working draft), §4.5.

About the statement of your colleague, there are many wrong things in it:

  • A "registry" (register) is not generally 32 bits wide, not at all.

  • If a byte has 8 bits, a 32-bit register can of course hold multiple bytes,
    but that isn't relevant here, and the compiler is not what makes it possible.

  • What about this is "predictive"?

  • The bit about the standard and 32 bits is completely wrong.

  • Integer promotion has nothing to do with structs.

  • In the standard, there is no "stack". Real implementations use one,
    but the concept is not mandatory (as others said).

  • He's saying that everything needs to be 32 bits, but since CPUs
    can process other sizes too, nothing can be said for sure? Which is it?

...

deviantfan
2

C does not require a stack or specify anything about 32-bit registers.

One rationale for integer promotions, as CERT puts it:

Integer promotions are performed to avoid arithmetic errors resulting from the overflow of intermediate values. For example:

signed char cresult, c1, c2, c3;
c1 = 100;
c2 = 3;
c3 = 4;
cresult = c1 * c2 / c3;

Note that not all operators subject their operands to the usual arithmetic conversions. For example, there is no integer promotion with the simple assignment operator or the cast operator.
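
The CERT example can be turned into a small sketch to see why the promotion matters. The function names `scaled` and `intermediate_product` are hypothetical, and the comments assume an 8-bit `signed char`, where the intermediate product 300 does not fit.

```c
#include <assert.h>

/* Evaluates c1 * c2 / c3 as in the quoted example. The operands are
   promoted to int, so the intermediate product is computed in int and
   only the final quotient is converted back to signed char. */
signed char scaled(signed char c1, signed char c2, signed char c3)
{
    assert(c3 != 0);
    return (signed char)(c1 * c2 / c3);
}

/* Exposes the intermediate value: the multiplication already yields an int. */
int intermediate_product(signed char c1, signed char c2)
{
    return c1 * c2;
}
```

With `c1 = 100`, `c2 = 3`, `c3 = 4`, the intermediate product is 300 and the final result is 75, which fits in a `signed char` again.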

ouah
0

Logically, yes, all operations are performed on promoted values. However, under the as-if rule, a compiler that can prove the results are identical may omit the actual promotion. As a trivial example, `if (ch == 0)` nominally requires promoting ch to int, but in practice this is not needed at all: an optimizer can easily see that (int)ch is zero if and only if ch is zero.

So the raw CPU performance and the different CPU flavors matter less than you'd think; what matters is whether the optimizer can find a decent set of instructions.
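
The equivalence the optimizer relies on can be checked exhaustively. This is only a rough model, not what any particular compiler emits: `is_zero_promoted` spells out the promotion, while `is_zero_bytewise` stands in for a narrow comparison by comparing the stored byte directly. All three function names are hypothetical, and the sketch assumes `int` is wider than `char` and that zero has a single representation (as on two's complement machines).

```c
#include <limits.h>
#include <string.h>

/* What the abstract machine describes: promote ch to int, then compare. */
int is_zero_promoted(char ch)
{
    return (int)ch == 0;
}

/* Stand-in for a narrow comparison: compare the single stored byte
   against a zero byte, without any arithmetic on ch. */
int is_zero_bytewise(char ch)
{
    const char zero = 0;
    return memcmp(&ch, &zero, 1) == 0;
}

/* Returns 1 if both versions agree for every possible char value. */
int agree_for_all_chars(void)
{
    for (int v = CHAR_MIN; v <= CHAR_MAX; v++) {
        char ch = (char)v;
        if (is_zero_promoted(ch) != is_zero_bytewise(ch))
            return 0;
    }
    return 1;
}
```

Because the two functions agree on every input, a compiler is free to pick whichever form is cheaper on the target.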

MSalters