41

Why is char by default in the range from -128 to 127 when it is supposed to represent a 'character' whose textual representations are in the range from 0 to 255? In that sense I'd guess char should be unsigned by default, and only if we intended to treat it purely as a number would we have to add the 'signed' keyword. Should I therefore use unsigned char when I work with text files?

Also I don't understand why std::ifstream's read and std::ofstream's write functions use char and not unsigned char when I need to work with binary files. There I don't care about signedness, do I? Moreover, I've successfully made a copy of a JPEG file using signed char like this:

//..open all streams (in binary mode, std::ios::binary)..
char c;
while(input.peek()!=EOF){
    input.read(&c,1);   //std::ifstream input;
    output.write(&c,1); //std::ofstream output;
}
//..close all streams..

Since it works, I think read reads unsigned bytes (in image processing an unsigned char is commonly used) and sets c so that the value gets an accidental signed interpretation in two's complement. I need to create a histogram of values, but I get a runtime error because I use the signed char as an index. Isn't it rather stupid that I have to use a cast like uc = (unsigned char)c; when there could at least be a simple overload of read/write for unsigned char?

Daniel Katz
  • 18
    `char` is not always signed. And ASCII ends at 127, so it's fairly logical not to go past that, considering nearly all systems use it. – chris Jun 13 '13 at 21:39
  • +1 @chris, it's implementation dependent. – Carl Norum Jun 13 '13 at 21:39
  • Oh, I apparently missed the notes about char in the documentation. On my PC char is signed by default and its character set includes some accented letters, so it didn't make sense to me. – Daniel Katz Jun 13 '13 at 22:47
  • If it's machine-dependent, how can I use the 'write' function with char? Would I be forced to use 'signed char' on another machine where char is unsigned by default? Or would char have a different meaning for the same function? – Daniel Katz Jun 13 '13 at 22:48
  • @DanielKatz: `char` is `char`. It is not `signed char` **or** `unsigned char` -- the three types are all distinct. –  Jun 13 '13 at 23:14
  • 3
    @DanielKatz Forget the terminology "by default". On implementations where char is signed, char and signed char are still two separate types. On implementations where char is unsigned, char and unsigned char are still two separate types. On _all_ implementations, char, signed char, and unsigned char, are always three separate types. This is different from the other integral types, where the terminology "signed by default" makes sense. – Oktalist Jun 13 '13 at 23:16
  • 1
    *"Why is 'char' signed by default in C++?"* - Easy answer: It isn't! – Christian Rau Jun 14 '13 at 08:40
  • If I remember correctly it was unsigned by default on Watcom, so when we moved code to Visual Studio we had quite a mess til we figured that out. – Retired Ninja Jun 14 '13 at 08:45
  • 1
    ASCII ends at 127, but we've had 8 bit characters on plenty of platforms since pretty ancient history - ISO8859, Windows code pages, DOS code pages, all those weird 16 bit machines in the 90s and 8 bit machines in the 80s. Sure C and C++ don't say that `char` is signed by default, and maybe some platforms somewhere have reason for preferring signed `char` by default, but I don't understand myself why signed seems always to have been the default for the compilers I've used. It's almost as if compiler writers just naturally hate 8 bit character sets. –  Jun 14 '13 at 08:49
  • Does this answer your question? [Why don't the C or C++ standards explicitly define char as signed or unsigned?](https://stackoverflow.com/questions/15533115/why-dont-the-c-or-c-standards-explicitly-define-char-as-signed-or-unsigned) – phuclv Dec 19 '19 at 00:17

2 Answers

56

It isn't.

The signedness of a `char` that isn't explicitly `signed char` or `unsigned char` is implementation-defined. Many systems make it signed to match other types that are signed by default (like `int`), but it may be unsigned on some systems. (Say, if you pass `-funsigned-char` to GCC.)
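If you want to see which choice your implementation made, it can be checked at compile time with standard facilities (a minimal sketch, not part of the original answer):

```cpp
#include <climits>
#include <limits>

// Whether plain char is signed is implementation-defined; these two
// standard queries always agree on any one implementation.
constexpr bool char_is_signed = std::numeric_limits<char>::is_signed;
static_assert(char_is_signed == (CHAR_MIN < 0),
              "CHAR_MIN is negative exactly when plain char is signed");
```

With GCC, compiling the same translation unit with `-fsigned-char` versus `-funsigned-char` flips the value of `char_is_signed`.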

  • 1
    @DanielKatz: It doesn't matter if it is on your compiler; it's *implementation-defined*. Yours defined it one way, another could use another way. – Nicol Bolas Jun 13 '13 at 22:29
  • 1
    `std::cout << std::is_same<char, signed char>::value << "\n"; // false` `std::cout << std::is_same<char, unsigned char>::value << "\n"; // false` – 8.8.8.8 Jul 25 '19 at 05:18
36

Here is your answer from the standard:

3.9.1 Fundamental types [basic.fundamental]

1 Objects declared as characters (char) shall be large enough to store any member of the implementation's basic character set. If a character from this set is stored in a character object, the integral value of that character object is equal to the value of the single character literal form of that character. It is implementation-defined whether a char object can hold negative values. Characters can be explicitly declared unsigned or signed. Plain char, signed char, and unsigned char are three distinct types. A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (basic.types); that is, they have the same object representation. For character types, all bits of the object representation participate in the value representation. For unsigned character types, all possible bit patterns of the value representation represent numbers. These requirements do not hold for other types. In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.

Jean-Bernard Pellerin