
I saw a piece of valid C code that I tried to compile as C++, and I got an error I can't understand:

```cpp
char* t;
signed char* v = t;
```

```
error: invalid conversion from char* to signed char*
```

From what I learned, char and signed char are semantically identical, but are still considered distinct by the compiler.

I know that the error is caused by the difference between these two types. My question is: why does this difference exist?

As far as I know, char is implemented either as signed char or as unsigned char, so it should be identical to one or the other.


I consulted this question, but it doesn't answer the point I want to understand.

Jaffa
  • I like to think of `signed` and `unsigned char` as *arithmetic* types, just small integers, essentially, while `char` is the I/O type -- command line arguments, environment and read/write via files is done in terms of chars. – Kerrek SB Oct 18 '13 at 08:13
  • this question has already been answered at http://stackoverflow.com/questions/436513/char-signed-char-char-unsigned-char -- the most basic explanation is that `unsigned char` ranges from 0..255 and `signed char` ranges from -128..127. So you cannot convert a `signed char` of -42 to `unsigned char` or convert an `unsigned char` of 142 to `signed char`. `char` is usually read as `signed char`. – alle_meije Oct 18 '13 at 08:16
  • @alle_meije and `signed char` and `char` *are* equal. And I didn't see this question, although I searched for it... – Jaffa Oct 18 '13 at 08:17
  • That's not how I read `It is implementation-defined whether a char object can hold negative values.` – alle_meije Oct 18 '13 at 08:21
  • @alle_meije what I mean is that it's then either equal to `signed char` or to `unsigned char` depending on the implementation. But as the answer states, it's a requirement of the spec that the types are different, not a difference between the types in themselves. – Jaffa Oct 18 '13 at 08:24
  • What I mean is that "it should be one or the other" means that your code will work on one compiler system and (possibly) not on another. Here "implementation" refers to the implementation of the compiler, on which your code should not depend. It's the same as implementing pointer NULL as the number 0. Most compilers do that but there is no guarantee. – alle_meije Oct 18 '13 at 08:30
  • @Geoffroy signed char and char are not GUARANTEED to be the same. There is no requirement for char to be signed, and this has always been the case. They may be signed _on your platform_. On _another_ platform a `char` might actually be `unsigned char`. And this isn't some ivory tower "in theory", this is a real world practical consideration. There **are** platforms which do it the other way. – mjs Oct 18 '13 at 09:35

3 Answers


Actually, I finally found the part of the spec that covers this:

3.9.1 Fundamental types

  1. Objects declared as characters (char) shall be large enough to store any member of the implementation’s basic character set. If a character from this set is stored in a character object, the integral value of that character object is equal to the value of the single character literal form of that character. It is implementation-defined whether a char object can hold negative values. Characters can be explicitly declared unsigned or signed. Plain char, signed char, and unsigned char are three distinct types. A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (3.11); that is, they have the same object representation. For character types, all bits of the object representation participate in the value representation. For unsigned character types, all possible bit patterns of the value representation represent numbers. These requirements do not hold for other types. In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.
Jaffa

> From what I learned, char and signed char are semantically identical, but are still considered as different by the compiler.

NO. `char` is not semantically identical to `signed char`.

In contrast to the other integral types (int, long, short, etc.), there is no guarantee that char without a signed or unsigned qualifier will be signed; this is implementation-defined. In the real world, some architectures define it as signed, others as unsigned.

So, with a char, if the signedness is important, you really do need to specify which you want.

My recommendation would be: if you are doing character manipulation, or using an API call that takes char or char *, use char. If you just want an 8-bit integer value, specify signed char or unsigned char explicitly, so that in a couple of years' time, when you port to a different architecture, you don't get bitten in the bum.

Or better yet, use uint8_t or int8_t for 8 bit integers.

EDIT: From your own answer:

> These requirements do not hold for other types. In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.

mjs

I will say what I know...

In C++, the char type has a size of 1 byte.

If it is signed char, the range is -128 to 127; if it is unsigned char, the range is 0 to 255.

We all know a byte has 8 bits. For signed char, the MSB (i.e., the leftmost bit) indicates the sign, leaving 7 bits of magnitude, hence the non-negative range 0..2^7-1 (0..127). In the usual two's-complement representation, MSB 0 means non-negative and MSB 1 means negative (e.g. 00000111 = +7, 10000001 = -127, and 10000000 = -128). If you assign 129 to a signed char, the bit pattern is reinterpreted and you typically get -127, i.e., a value back in the range [-128, 127].

In the case of unsigned char, all 8 bits are used for the value, so the range is 0..2^8-1 (0..255). Here 0..127 is the same as for signed char, while the bit patterns that mean -128..-1 for signed char mean 128..255 for unsigned char.

So although the two types occupy the same storage, 'signed char' and 'unsigned char' interpret the same bit patterns differently, which may be the source of the problem.

Raon
  • Thanks for your answer, it's all true but it was not the actual problem :) – Jaffa Oct 18 '13 at 09:13
  • @Raon: Actually the standard says: `sizeof(char) == 1` but doesn't establish that 1 => 1 Byte, but simply that `char` is the smallest allocatable block of memory your system supports. Some esoteric system may use `1KB` as smallest possible allocatable size… Also note that the standard does not specify the representation of signed types! – MFH Oct 18 '13 at 09:16
  • Initializing 'signed char *' with an expression of type 'char *' converts between pointers to integer types with different sign – Raon Oct 18 '13 at 09:18
  • @MFH no mate, if it is one char it is one; the thing you're talking about is the OS's memory allocation. But we will have further memory managers for our program. – Raon Oct 18 '13 at 09:22
  • @Raon: 1 char != 1 byte, if you claim otherwise you may want to have a word with the standards committee (see §1.7.1). What the C++ standard call a "byte" is not necessarily what we normally call a byte… – MFH Oct 18 '13 at 09:45