41

Consider the following code:

#include <iostream>
#include <type_traits>

int main(int argc, char* argv[])
{
    std::cout<<"std::is_same<int, int>::value = "<<std::is_same<int, int>::value<<std::endl;
    std::cout<<"std::is_same<int, signed int>::value = "<<std::is_same<int, signed int>::value<<std::endl;
    std::cout<<"std::is_same<int, unsigned int>::value = "<<std::is_same<int, unsigned int>::value<<std::endl;
    std::cout<<"std::is_same<signed int, int>::value = "<<std::is_same<signed int, int>::value<<std::endl;
    std::cout<<"std::is_same<signed int, signed int>::value = "<<std::is_same<signed int, signed int>::value<<std::endl;
    std::cout<<"std::is_same<signed int, unsigned int>::value = "<<std::is_same<signed int, unsigned int>::value<<std::endl;
    std::cout<<"std::is_same<unsigned int, int>::value = "<<std::is_same<unsigned int, int>::value<<std::endl;
    std::cout<<"std::is_same<unsigned int, signed int>::value = "<<std::is_same<unsigned int, signed int>::value<<std::endl;
    std::cout<<"std::is_same<unsigned int, unsigned int>::value = "<<std::is_same<unsigned int, unsigned int>::value<<std::endl;
    std::cout<<"----"<<std::endl;
    std::cout<<"std::is_same<char, char>::value = "<<std::is_same<char, char>::value<<std::endl;
    std::cout<<"std::is_same<char, signed char>::value = "<<std::is_same<char, signed char>::value<<std::endl;
    std::cout<<"std::is_same<char, unsigned char>::value = "<<std::is_same<char, unsigned char>::value<<std::endl;
    std::cout<<"std::is_same<signed char, char>::value = "<<std::is_same<signed char, char>::value<<std::endl;
    std::cout<<"std::is_same<signed char, signed char>::value = "<<std::is_same<signed char, signed char>::value<<std::endl;
    std::cout<<"std::is_same<signed char, unsigned char>::value = "<<std::is_same<signed char, unsigned char>::value<<std::endl;
    std::cout<<"std::is_same<unsigned char, char>::value = "<<std::is_same<unsigned char, char>::value<<std::endl;
    std::cout<<"std::is_same<unsigned char, signed char>::value = "<<std::is_same<unsigned char, signed char>::value<<std::endl;
    std::cout<<"std::is_same<unsigned char, unsigned char>::value = "<<std::is_same<unsigned char, unsigned char>::value<<std::endl;
    return 0;
}

The result is:

std::is_same<int, int>::value = 1
std::is_same<int, signed int>::value = 1
std::is_same<int, unsigned int>::value = 0
std::is_same<signed int, int>::value = 1
std::is_same<signed int, signed int>::value = 1
std::is_same<signed int, unsigned int>::value = 0
std::is_same<unsigned int, int>::value = 0
std::is_same<unsigned int, signed int>::value = 0
std::is_same<unsigned int, unsigned int>::value = 1
----
std::is_same<char, char>::value = 1
std::is_same<char, signed char>::value = 0
std::is_same<char, unsigned char>::value = 0
std::is_same<signed char, char>::value = 0
std::is_same<signed char, signed char>::value = 1
std::is_same<signed char, unsigned char>::value = 0
std::is_same<unsigned char, char>::value = 0
std::is_same<unsigned char, signed char>::value = 0
std::is_same<unsigned char, unsigned char>::value = 1 

This means that int and signed int are considered the same type, but char and signed char are not. Why is that?

And if I can transform a char into a signed char using make_signed, how do I do the opposite (transform a signed char into a char)?

Vincent
  • Interesting, I knew `char` could be signed or unsigned, but I thought it would at least be equivalent to one of those. – chris May 12 '13 at 01:45
  • 2
    Possible duplicate of [char!=(signed char), char!=(unsigned char)](https://stackoverflow.com/questions/436513/char-signed-char-char-unsigned-char) – jtbandes Jul 15 '17 at 07:38
  • Possible duplicate of [What is an unsigned char?](https://stackoverflow.com/questions/75191/what-is-an-unsigned-char) – phuclv Jun 23 '19 at 05:09
  • 1
    other duplicates: [What does it mean for a char to be signed?](https://stackoverflow.com/q/451375/995714), [Difference between signed / unsigned char](https://stackoverflow.com/q/4337217/995714), [What is signed char?](https://stackoverflow.com/q/21545008/995714) – phuclv Jun 23 '19 at 05:11

4 Answers

33

There are three distinct basic character types: char, signed char and unsigned char. Although there are three character types, there are only two representations: signed and unsigned. The (plain)char uses one of these representations. Which of the other two character representations is equivalent to char depends on the compiler.

In an unsigned type, all the bits represent the value. For example, an 8-bit unsigned char can hold the values from 0 through 255 inclusive.

The standard does not define how signed types are represented, but does specify that the range should be evenly divided between positive and negative values. Hence an 8-bit signed char is guaranteed to be able to hold values from -127 through 127.


So how do you decide which type to use?

Computations using char are usually problematic. Plain char is signed on some machines and unsigned on others, so we should not use (plain) char in arithmetic expressions. Use it only to hold characters. If you need a tiny integer, explicitly specify either signed char or unsigned char.

Excerpts taken from C++ Primer 5th edition, p. 66.

Nicolas
Ankit Gupta
  • 10
    I know this post is from long time ago but this answer is identical to a paragraph in *C++ Primer*, second chapter. – Mia Aug 25 '17 at 10:16
  • @JerieWang I made an edit, this was effectively C++ Primer verbatim. – Nicolas Jan 27 '21 at 23:52
28

It's by design: the C++ standard says char, signed char and unsigned char are different types. I think you can use static_cast for the transformation.

Sergi0
9

Indeed, the Standard states precisely that char, signed char and unsigned char are three different types. A char is usually 8 bits, but this is not imposed by the standard. An 8-bit number can encode 256 unique values; the difference is only in how those 256 values are interpreted. If you consider an 8-bit value as a signed (two's complement) binary value, it can represent integer values from -128 (coded 80H) to +127. If you consider it unsigned, it can represent values 0 to 255. By the C++ standard (before C++20), a signed char is only guaranteed to hold values -127 to 127 (not -128!), whereas an unsigned char can hold values 0 to 255.

When converting a char to an int, the result depends on the implementation. For example, the single char 'É' (ISO 8859-1, code C9H) may convert to -55 or 201 depending on the machine. Indeed, a CPU holding the char in a 16-bit word can store FFC9, 00C9, C900, or even C9FF (in big- and little-endian representations). Explicit casts to signed char or unsigned char pin down the outcome of the char-to-int conversion.

Bernard Hauzeur
  • I think all 11111111 (0xFF) stands for -1 as for signed char, not -128. I tried on VS. – Rick Mar 21 '18 at 23:57
  • 1
    Thank you for pointing out this horrible mistake; it is now fixed in the post. Indeed, -128 is 80H and not FFH, which is -1. It is easy to find the binary representation of a negative value: for 8 bits, take its complement to 256 (for n bits, to 2^n), e.g. for -1: 256 - 1 = 255 = FFH; for -5: 256 - 5 = 251 = FBH; and -128 yields 256 - 128 = 128 = 80H. One can play with the old Windows Calculator set to programmer mode. – Bernard Hauzeur Mar 23 '18 at 09:07
  • Sadly some implementations chose the less intuitive default for `char`, treating characters as -128 to 127 instead of 0 to 255, even though no ANSI code chart ever used negative numbers (nor any Unicode ones). This leads to exploits and access violations when people use characters as indices into arrays, because a character like 'Ä' is treated as -42. – Dwayne Robinson Apr 29 '21 at 03:44
3

Adding more info about the range: since C++20, the value -128 is also guaranteed for signed char, per P1236R0 (Alternative Wording for P0907R4, Signed Integers are Two's Complement):

For each value x of a signed integer type, there is a unique value y of the corresponding unsigned integer type such that x is congruent to y modulo 2^N, and vice versa; each such x and y have the same representation.

[ Footnote: This is also known as two's complement representation. ]
[ Example: The value -1 of a signed type is congruent to the value 2^N - 1 of the corresponding unsigned type; the representations are the same for these values. ]

The minimum value required to be supported by the implementation for the range exponent of each signed integer type is specified in table X.

I painstakingly rewrote table X below (since SO does not support Markdown for tables):

╔═════════════╦════════════════════════════╗  
║ Type        ║ Minimum range exponent N   ║  
╠═════════════╬════════════════════════════╣  
║ signed char ║        8                   ║  
║ short       ║       16                   ║  
║ int         ║       16                   ║  
║ long        ║       32                   ║  
║ long long   ║       64                   ║  
╚═════════════╩════════════════════════════╝  

Hence, as a signed char has a range exponent N of at least 8, its range covers at least -2ᴺ⁻¹ to 2ᴺ⁻¹-1 with N equal to 8.

The guaranteed range is therefore -128 to 127. Hence, when it comes to range, a plain char that is signed no longer differs from signed char.


About Cadoiz's comment: there is what the standard says, and there is the reality.
Reality check with the program below:

#include <stdio.h>

int main(void) {
    char c = -128;
    printf("%d\n", (int)c);
    printf("%d\n", (int)--c);
    return 0;
}

Output:

-128
127

I would also say that writing signed char helps fellow programmers, and potentially the compiler, to understand that you will use the char's value to perform pointer arithmetic.

Antonin GAVREL
  • Just for your concern: Tables are now supported - unfortunately, I wasn't able to take the freedom for an edit, the queue is full. https://meta.stackexchange.com/q/356997/390859 Attention, signed char only ranges from -127, not -128 to 127 - consider the other answers. – Cadoiz Mar 24 '21 at 01:17
  • Tables are supported since November 2020, my answer is from April. And to affirm that signed char only ranges from -127 to 127 you need some evidence. Do you have any? I do, see my edit. – Antonin GAVREL Mar 24 '21 at 06:02
  • Neither was meant offensively, just as a hint. Also, you omitted the keyword `signed` before `char`. On which platform did you test the code? My evidence is limited to my own experience with MS VS for x86 and x64, but I didn't try the most recent one. I absolutely agree with your opinion on preferring `(un)signed char` over `char` for arithmetic (or probably everything besides actual characters like `'a'`). – Cadoiz Mar 24 '21 at 19:55
  • I know, and I don't mean to be offensive either. My point is: please try the above program on your OS and show me the resulting output. The code was tested on Ubuntu (18.04). – Antonin GAVREL Mar 24 '21 at 19:57
  • Just a small note: https://wg21.cmeerw.net/cwg/issue1759 I will post real evidence as soon as I have time for that. – Cadoiz Mar 24 '21 at 20:07
  • I am looking forward to it :) – Antonin GAVREL Mar 24 '21 at 20:54