23

I've made a simple program and compiled it with GCC 4.4/4.5 as follows:

int main ()
{
  char u = 10;
  char x = 'x';
  char i = u + x;

  return 0;
}

g++ -c -Wconversion a.cpp

And I've got the following:

a.cpp: In function ‘int main()’:
a.cpp:5:16: warning: conversion to ‘char’ from ‘int’ may alter its value

I get the same warning for the following code:

  unsigned short u = 10;
  unsigned short x = 0;
  unsigned short i = u + x;

a.cpp: In function ‘int main()’:
a.cpp:5:16: warning: conversion to ‘short unsigned int’ from ‘int’ may alter its value

Could anyone please explain to me why the addition of two chars (or two unsigned shorts) produces an int? Is it a compiler bug, or is it standard-compliant?

Thanks.

hiddensunset4
Rom098
  • I wonder if there is some compiler optimisation going on here whereby the 'u' in your addition is just getting substituted with the literal value 10. However, that seems rather buggy and non-standards-compliant. – Nick Jan 27 '11 at 09:43

3 Answers

27

What you're seeing is the result of the so-called "usual arithmetic conversions" that occur during arithmetic expressions, particularly those that are binary in nature (take two arguments).

This is described in §5/9:

Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:

— If either operand is of type long double, the other shall be converted to long double.
— Otherwise, if either operand is double, the other shall be converted to double.
— Otherwise, if either operand is float, the other shall be converted to float.
— Otherwise, the integral promotions (4.5) shall be performed on both operands.
— Then, if either operand is unsigned long the other shall be converted to unsigned long.
— Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int shall be converted to a long int; otherwise both operands shall be converted to unsigned long int.
— Otherwise, if either operand is long, the other shall be converted to long.
— Otherwise, if either operand is unsigned, the other shall be converted to unsigned.

[Note: otherwise, the only remaining case is that both operands are int]

The promotions alluded to in §4.5 are:

1 An rvalue of type char, signed char, unsigned char, short int, or unsigned short int can be converted to an rvalue of type int if int can represent all the values of the source type; otherwise, the source rvalue can be converted to an rvalue of type unsigned int.

2 An rvalue of type wchar_t (3.9.1) or an enumeration type (7.2) can be converted to an rvalue of the first of the following types that can represent all the values of its underlying type: int, unsigned int, long, or unsigned long.

3 An rvalue for an integral bit-field (9.6) can be converted to an rvalue of type int if int can represent all the values of the bit-field; otherwise, it can be converted to unsigned int if unsigned int can represent all the values of the bit-field. If the bit-field is larger yet, no integral promotion applies to it. If the bit-field has an enumerated type, it is treated as any other value of that type for promotion purposes.

4 An rvalue of type bool can be converted to an rvalue of type int, with false becoming zero and true becoming one.

5 These conversions are called integral promotions.

From here, sections such as "Multiplicative operators" or "Additive operators" all have the phrase: "The usual arithmetic conversions are performed..." to specify the type of the expression.

In other words, when you do integral arithmetic, the type of the result is determined by the rules above. In your case, the operands are promoted under §4.5/1, so the type of the expression is int.
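The rule above can be checked at compile time. The following is a minimal sketch, assuming C++11 or later (for decltype and static_assert) and a platform where int can represent every value of char and unsigned short. The names sum_is_int and sum_of_chars are hypothetical, introduced only for illustration.

```cpp
#include <type_traits>

// Hypothetical trait: true when T + T has type int after the usual
// arithmetic conversions. decltype does not evaluate the expression.
template <typename T>
struct sum_is_int : std::is_same<decltype(T() + T()), int> {};

// Assumption: int can represent all values of these types, so §4.5/1
// promotes both operands to int.
static_assert(sum_is_int<char>::value, "char + char has type int");
static_assert(sum_is_int<unsigned short>::value,
              "unsigned short + unsigned short has type int");

// Types that are already at least as wide as int are not promoted.
static_assert(!sum_is_int<long>::value, "long + long has type long");

// Hypothetical helper: the sum is computed in int, so it can exceed
// the range of char without being truncated.
inline int sum_of_chars(char a, char b) {
    return a + b;
}
```

For example, sum_of_chars(100, 100) yields 200, a value that would not fit in a signed 8-bit char. That is the situation -Wconversion is warning about when the sum is stored back into a char.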

GManNickG
  • Yes, thanks. However, I noticed that §4.5/1 says "can be converted", not "must"... Does it mean that other C++ compilers may produce char, not int? – Rom098 Jan 27 '11 at 10:07
  • @Roman: What it's saying is "this type can become this other type" as a definitive statement (a requirement), then it calls the act of actually making that conversion "integral promotions". So when it says "perform integral promotions", you know `char` will be promoted to an `int` (or `unsigned int`) because, as it demanded, it *is* convertible to such a type ("can be"). – GManNickG Jan 27 '11 at 10:12
  • Thanks, I see. But it looks like it is useful for compiler developers, not for other developers (for me, for example). To avoid this warning I need to write something like this: char res = char(u + x); Seems not too comfortable. – Rom098 Jan 27 '11 at 10:30
  • @Roman: It has a rationale behind it. Why do you want the result to be a `char`? – GManNickG Jan 27 '11 at 10:34
  • Well, it doesn't matter whether it is char or unsigned short. The point is I've got some legacy code which uses arrays of small integer types. So for example addition of two arrays leads to the warning after I start using GCC 4.4 instead of an old compiler version. – Rom098 Jan 27 '11 at 10:37
  • @Roman: Ah, ye olde legacy code. :) Can't help you there, then. – GManNickG Jan 27 '11 at 10:43
  • Actually I don't want to modify a lot of code just to avoid the warning. – Rom098 Jan 27 '11 at 10:43
  • This warning is really useful for finding potential bugs in code, except in the described case. There is no doubt that the conversion is standard-compliant, but the question is: should a compiler warn in this case or not? If not, I shouldn't have to modify my code. At the moment I'm trying to find any patches or bug reports about this case on gcc.gnu.org. Do you know whether other compilers behave this way? – Rom098 Jan 27 '11 at 11:16
  • @Roman: Yes, this is completely normal and good, it's telling you the conversion might lose data. If you're certain it won't, the correct thing to do is to cast the result (well, more correct is to use a wider data type in the first place). Note correct doesn't mean easy. – GManNickG Jan 27 '11 at 11:45
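The two options mentioned in the comment above (cast the result, or use a wider type) might look like the following. This is a minimal sketch, assuming C++11 or later and 8-bit bytes; add_narrow and add_wide are hypothetical helper names, not part of the original code.

```cpp
#include <climits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical helper: the addition is performed in int, then the result
// is narrowed with an explicit cast. The explicit cast states the intent,
// so -Wconversion has nothing implicit to warn about.
inline unsigned char add_narrow(unsigned char a, unsigned char b) {
    return static_cast<unsigned char>(a + b);
}

// Hypothetical helper: the same sum kept in a wider type, so no value
// is lost.
inline int add_wide(unsigned char a, unsigned char b) {
    return a + b;
}
```

With operands 200 and 100, add_wide returns 300, while add_narrow returns 44 (300 modulo 256). The cast silences the warning but does not prevent the loss of data, so it is only appropriate when the sum is known to fit or when wrap-around is intended.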
5

When you perform an arithmetic operation on operands of type char, they are promoted first, so the result is of type int.

See this:

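#include <iostream>
using namespace std;
int main()
{
    char c = 'A';
    cout << sizeof(c) << endl;    // size of char itself, always 1
    cout << sizeof(+c) << endl;   // unary plus promotes c to int
    cout << sizeof(-c) << endl;   // unary minus promotes c to int
    cout << sizeof(c-c) << endl;  // both operands promoted, result is int
    cout << sizeof(c+c) << endl;  // both operands promoted, result is int
}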

Output (on a platform where sizeof(int) is 4):

1
4
4
4
4

Demonstration at ideone : http://www.ideone.com/jNTMm

Nawaz
  • Yep, and it makes sense when you think about it. Adding two `char`s together could easily overflow the result, but not if the result type is at least twice the width of the original one! – Lightness Races in Orbit Jan 27 '11 at 10:14
  • @Tomalak Geret'kal: Yes, that is the rationale behind it. Thanks for mentioning that. :-) – Nawaz Jan 27 '11 at 10:16
  • @Tomalak Geret'kal: There's no guarantee that `sizeof(int)>sizeof(char)`. The rationale was (as far as I can see) that addition is done by the CPU, and many RISC CPUs only do full-width addition. You'd end up with a 16- or 32-bit value in a register. Limiting it to `char` width would take an extra operation. – MSalters Jan 27 '11 at 10:37
  • @Tomalak: Actually I doubt it's because of a potential overflow. After all, `int i; sizeof(i+i);` is `4` too, and the same goes for `short s; sizeof(s+s)`. I think it's got more to do with operating on `int`s whenever possible (i.e. when int is large enough) than with covering overflow. – Matthieu M. Jan 27 '11 at 10:40
  • @MatthieuM: Actually the result of `sizeof` there is governed by the platform/implementation. – Lightness Races in Orbit Jan 27 '11 at 12:09
  • @MSalters: I know there's no guarantee of it, but it seems bloody likely. Your explanation is more logical though. I was only really positing how the situation _makes sense_, not necessarily _why_ it was made that way. – Lightness Races in Orbit Jan 27 '11 at 12:10
  • @Tomalak: I agree, it is not at all mandated by the standard (but it is permitted), it is simply a liberty that compilers can take to optimize operations. – Matthieu M. Jan 27 '11 at 12:19
3

When you add these two characters, they are first promoted to int.

The operands of an addition are rvalues that are implicitly promoted to type int, provided an int can represent every value of the operand type; the result of the addition then has type int as well. This is true on any platform where sizeof(int) > sizeof(char). But beware that plain char may be treated as either signed or unsigned char by your compiler.
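The point about signedness can be illustrated with a small sketch, assuming a platform where int can represent every value of the character types. The names promoted_value and plain_char_is_signed are hypothetical, chosen for this example.

```cpp
#include <limits>

// Hypothetical helper: unary plus applies the integral promotions to its
// operand. Promotion preserves the value, so the result depends on whether
// the source type is signed.
template <typename T>
int promoted_value(T c) {
    return +c;
}

// Whether plain char is signed is implementation-defined; this reports
// what the current compiler chose.
inline bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}
```

A signed char holding -1 promotes to -1, while an unsigned char holding 255 promotes to 255. A plain char with the same bit pattern gives one or the other depending on plain_char_is_signed(), which is why arithmetic on plain char can behave differently across compilers.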

These links can be of further help - wiki and securecoding

ayush
  • Which document describes this behavior? The C++ Standard or...? Could you please give a link to it? – Rom098 Jan 27 '11 at 09:57