203

In the book "Complete Reference of C" it is mentioned that char is by default unsigned.

But when I try to verify this with GCC as well as Visual Studio, both take it as signed by default.

Which one is correct?

Paulo Mattos
  • 18,845
  • 10
  • 77
  • 85
C Learner
  • 2,287
  • 2
  • 14
  • 7
  • 8
    The one C reference book I trust is Harbison & Steele's "C: A Reference Manual" (http://www.careferencemanual.com/). Of course the standard is the final word, but it's not very readable and only gives the slightest information on pre-standard and common (ie., POSIX) uses that are outside the standard. Harbison & Steele is quite readable, detailed and probably more correct than most references. However it also isn't a tutorial, so if you're in the initial stages of learning it's probably not a great thing to jump into. – Michael Burr Jan 13 '10 at 07:02
  • 23
    I think the book you are reading is *C: The Complete Reference*, by Herbert Schildt. From a review of this book (http://www.accu.informika.ru/accu/bookreviews/public/reviews/c/c002173.htm): *I am not going to recommend this book (too many of you give too much weight to my opinions) but I do not think it deserves the same opprobrium that has been legitimately thrown at some of his other work.* As Michael says, a much better reference is *Harbison & Steele*. – Alok Singhal Jan 13 '10 at 07:14
  • 1
    My two cents here: Because `char` can be unsigned, as a rule of thumb use an `int` to read a value using `getchar()`, which might return `EOF`. `EOF` is usually defined as `-1` or other negative value, which storing in an `unsigned` is not what you want. Here's the declaration: `extern int getchar();` BTW, this recommendation comes also from "C: A Reference Manual" book. – Maxim Chetrusca Nov 03 '14 at 15:39
  • 9
    The one C reference I trust is ISO/IEC 9899:2011 :-) – Jeff Hammond Apr 06 '15 at 22:31
  • 5
    @MaxChetrusca good advice but bad rationale: even on the signed `char` case, you'd have to use `int` to store the return value. – Antti Haapala -- Слава Україні Feb 12 '16 at 08:21
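
To illustrate the getchar()/EOF point raised in the comment above, here is a minimal sketch (a hypothetical echo loop, not part of the original discussion): the return value is stored in an int so that EOF, a negative value, can be distinguished from every valid character.

#include <stdio.h>

int main(void)
{
    int c;  /* int, not char: EOF must be distinguishable from all character values */

    while ((c = getchar()) != EOF)
        putchar(c);

    return 0;
}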

6 Answers

262

The book is wrong. The standard does not specify whether plain char is signed or unsigned.

In fact, the standard defines three distinct types: char, signed char, and unsigned char. If you #include <limits.h> and then look at CHAR_MIN, you can find out whether plain char is signed or unsigned (CHAR_MIN will be less than zero if plain char is signed, and equal to zero if it is unsigned), but even then the three types are distinct as far as the standard is concerned.
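
For illustration, a minimal sketch of that CHAR_MIN check might look like this (assuming a hosted implementation with <stdio.h>):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* CHAR_MIN is SCHAR_MIN (negative) if plain char is signed,
       and 0 if it is unsigned. */
    if (CHAR_MIN < 0)
        printf("plain char is signed\n");
    else
        printf("plain char is unsigned\n");
    return 0;
}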

Do note that char is special in this way. If you declare a variable as int it is 100% equivalent to declaring it as signed int. This is always true for all compilers and architectures.

klutt
  • 30,332
  • 17
  • 55
  • 95
Alok Singhal
  • 93,253
  • 21
  • 125
  • 158
  • 1
    @Alok: the same is not true for some other datatypes, for example `int` means `signed int` always, right? Apart from `char`, what other datatypes have the same confusion in `C`? – Lazer Mar 28 '10 at 11:15
  • 10
    @eSKay: yes, `char` is the only type that can be signed or unsigned. `int` is equivalent to `signed int` for example. – Alok Singhal Mar 29 '10 at 00:54
  • 44
    There is a hysterical, er, historical reason for this -- early in the life of C the "standard" was flip-flopped at least twice, and some popular early compilers ended up one way and others the other. – Hot Licks Nov 28 '12 at 01:59
  • 12
    @AlokSinghal: It's also implementatin-defined whether a bit field of type `int` is signed or unsigned. – Keith Thompson Apr 01 '14 at 04:39
  • @KeithThompson thanks for the correction. I tend to forget some details about bit field types since I don't use them much. – Alok Singhal Apr 01 '14 at 04:45
  • I wonder why ANSI has yet to define any standard means by which code can say things like "within this region, I want the compiler to either regard `char` as unsigned or refuse compilation if it can't do that"? I understand that the standard must allow for the existence of different dialects of C, but if there's no standard way to say that whether `0xFFFF+1` should yield `0u` or 65536, then I would posit that such an expression should be considered meaningless in "standard C". – supercat Mar 31 '15 at 20:07
  • 1
    @supercat: `#if CHAR_MIN < 0` ... `#error "Plain char is signed"` ... `#endif` – Keith Thompson Apr 09 '15 at 20:32
  • @KeithThompson: That would cause a program to refuse compilation on any compiler whose *default* behavior is to use signed characters, without regard for whether the compiler has a means of switching between signed and unsigned character types. – supercat Apr 09 '15 at 20:39
  • @KeithThompson: The way I would like to see such directives, the expectation would be that a good compiler should attempt to support any options the platform could support in practical fashion. The fact that a processor has 32-bit registers shouldn't make it impossible to run code which expects that 0xFFFF+1 equals 0, nor code that expects that 0xFFFFFFFF+1 equals 0x100000000, provided that code which expects such things is marked to let the compiler know of its expectations. – supercat Apr 10 '15 at 03:45
  • @ChrisChiasson: I would think it unlikely that even a compiler where character-signedness is configurable would be able to automatically configure itself based upon such a static assertion, much less confine such behavior to a designated region of the code. – supercat Jan 06 '17 at 15:26
  • @ChrisChiasson: If there were a defined directive which said "Within this region I want the compiler to either regard `char` as unsigned or refuse compilation of it can't", compilers would be allowed to treat such directives as static assertions, but could also use them to instead control behavior. If the term "implementation-defined" can stretch far enough to let a compiler define a sequence of configuration settings it will try in sequence to see if any will work, a compiler could use "static assert" as a configuration directive, but that seems *really* hokey and unlikely. – supercat Jan 06 '17 at 17:55
  • Unlike in C++, in C, they are not three distinct types, I believe. In C there are two distinct types: `signed char` and `unsigned char`; and `char` is just an alias for one of those. Standard just does not say which one. Therefore in C++ you need explicit casting between them, in C you do not. – mity Nov 21 '18 at 21:33
80

As Alok points out, the standard leaves that up to the implementation.

For gcc, the default is signed, but you can modify that with -funsigned-char. Note: for gcc in the Android NDK, the default is unsigned. You can also explicitly ask for signed characters with -fsigned-char.

On MSVC, the default is signed but you can modify that with /J.
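
One way to see the effect of those flags is a tiny test program (a sketch; the file name test.c and the exact commands are illustrative, and the result of the cast assumes a typical two's complement machine):

/* test.c */
#include <stdio.h>

int main(void)
{
    char c = (char)0xFF;
    /* Prints -1 where plain char is signed, 255 where it is unsigned. */
    printf("%d\n", (int)c);
    return 0;
}

Compiling with gcc -fsigned-char test.c versus gcc -funsigned-char test.c (or cl /J test.c for MSVC) should flip the output.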

Cristian Ciupitu
  • 20,270
  • 7
  • 50
  • 76
R Samuel Klatchko
  • 74,869
  • 16
  • 134
  • 187
  • 2
    Interesting that Schildt's description doesn't match MSVC's behavior since his books are usually geared toward MSVC users. I wonder if MS changed the default at some point? – Michael Burr Jan 13 '10 at 07:17
  • 2
    I thought it wasn't dependent on the compiler, but on the platform. I thought char was left as a third type of "character datatype" to conform to what the systems at that time used as printable characters. – Spidey May 09 '12 at 19:45
  • 15
    [GCC docs](https://gcc.gnu.org/onlinedocs/gcc-4.2.2/gcc/C-Dialect-Options.html) say it's machine-dependent: "*Each kind of machine has a default for what char should be. It is either like unsigned char by default or like signed char by default.*" – Deduplicator Sep 07 '15 at 16:16
  • 1
    Can you please provide a source for your note that on android the default is unsigned char? – phlipsy Oct 22 '15 at 06:36
  • 1
    @Spidey the C standard makes no real distinction between compilers, platforms and CPU architectures. It just lumps them all together under "implementation". – plugwash Nov 23 '16 at 18:48
  • 2
    @Deduplicator So, the part "_For gcc, the default is signed_" in this answer is wrong? – Spikatrix Mar 27 '17 at 11:51
  • 1
    With GCC, char is (typically) signed on x86/x86_64 but unsigned on ARM. – Blaisorblade Feb 23 '23 at 01:39
  • I found the NDK caveat the hard way, thanks! – Didi Kohen Apr 08 '23 at 12:34
40

C99 N1256 draft 6.2.5/15 "Types" has this to say about the signedness of type char:

The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.

and in a footnote:

CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.
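
The "separate type" part of that footnote can be demonstrated with _Generic (a sketch that assumes a C11 compiler; the TYPE_NAME macro is just an illustrative name):

#include <stdio.h>

/* All three associations are allowed precisely because char,
   signed char, and unsigned char are distinct types. */
#define TYPE_NAME(x) _Generic((x),       \
    char:          "char",               \
    signed char:   "signed char",        \
    unsigned char: "unsigned char",      \
    default:       "other")

int main(void)
{
    char c = 'a';
    puts(TYPE_NAME(c));  /* prints "char" regardless of its signedness */
    return 0;
}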

Ciro Santilli OurBigBook.com
  • 347,512
  • 102
  • 1,199
  • 985
Michael Burr
  • 333,147
  • 50
  • 533
  • 760
10

According to The C Programming Language by Kernighan and Ritchie, the de facto standard book for ANSI C, whether plain chars are signed or unsigned is machine-dependent, but printable characters are always positive.

stdcall
  • 27,613
  • 18
  • 81
  • 125
Ravi Rathi
  • 109
  • 1
  • 2
  • 13
    It's not necessarily the case that *printable* characters are always positive. The C standard guarantees that all members of the basic execution character set have non-negative values. – Keith Thompson Apr 01 '14 at 04:40
10

According to the C standard, the signedness of plain char is "implementation-defined".

In general, implementors chose whichever was more efficient to implement on their architecture. On x86 systems, char is generally signed. On ARM systems, it is generally unsigned (Apple iOS is an exception).

plugwash
  • 9,724
  • 2
  • 38
  • 51
  • 1
    [why unsigned types are more efficent in ARM?](http://stackoverflow.com/q/3093669/995714), [is char signed or unsigned by default on iOS?](http://stackoverflow.com/q/20576300/995714) – phuclv May 16 '17 at 08:38
  • 4
    @plugwash Your answer was probably downvoted because [Tim Post lost his keys](https://meta.stackexchange.com/a/215397/349538). Seriously though, you shouldn't worry about a single downvote as long as you're sure your answer is correct (which it is in this case). It's happened to me several times to have my posts downvoted for no valid reason. Don't worry about it, sometimes people just do odd things. – Donald Duck Oct 02 '17 at 10:54
  • 2
    Why is signed char more efficient on x86? Any sources? – martinkunev Mar 12 '19 at 17:48
  • 3
    @martinkunev Necropost but: I don’t think signed char is more efficient as such on x86, but it’s also not less efficient than unsigned. Reasons for picking it might also include consistency with other integer types defaulting to signed, and maybe signed types sometimes leading to better optimisation due to signed overflow being undefined behaviour (i.e., compiler can assume it won’t overflow). – Arkku Nov 25 '21 at 17:58
4

Now we know the standard leaves that up to the implementation.

But how can you check whether a given type, such as char, is signed or unsigned?

I wrote a macro to do this:

#define IS_UNSIGNED(t) ((t)~1 > 0)

and tested it with gcc, clang, and cl, but I am not sure it is always safe in other cases.
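
For what it's worth, a usage sketch of that macro (a hypothetical test program, assuming a hosted implementation):

#include <stdio.h>

#define IS_UNSIGNED(t) ((t)~1 > 0)

int main(void)
{
    printf("char is %s\n", IS_UNSIGNED(char) ? "unsigned" : "signed");
    printf("int is %s\n", IS_UNSIGNED(int) ? "unsigned" : "signed");
    return 0;
}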

南山竹
  • 484
  • 8
  • 15
  • 6
    What is wrong with usual CHAR_MIN < 0 (or WCHAR_MIN < 0 for wchar_t)? – Öö Tiib May 21 '20 at 07:52
  • This builds on the assumption that signed integers are represented in two's complement. Although this almost always holds, some systems may use one's complement, where all bits set to one means negative zero, which equals positive zero, and your macro returns the wrong answer. – z32a7ul Feb 11 '23 at 19:55