Reading signed char using %u

Question

#include <stdio.h>

int main() {
    int i,n;
    int a = 123456789;

    void *v = &a;

    unsigned char *c = (unsigned char*)v;

    for(i=0;i< sizeof a;i++) {
        printf("%u  ",*(c+i));
    }

    char *cc = (char*)v;
    printf("\n %d", *(cc+1));

    char *ccc = (char*)v;
    printf("\n %u \n", *(ccc+1));

}

This program generates the following output on my 32-bit Ubuntu machine:

21  205  91  7  
-51
4294967245

I can understand the first two lines of output:

1st line: the order in which the bytes are stored in memory.
2nd line: the signed value of the second byte (2's complement).
3rd line: why such a large value?

Please explain the last line of output. Why are three bytes of 1s added, giving (11111111111111111111111111001101) = 4294967245?

Because `%u` is for an unsigned integer and your `*(ccc+1)` is a negative `char`. An unsigned integer is wider than a `char`, so the extra bits have to be filled in. As your `char` is negative, they are filled with `1`s; for a positive `char` they would be filled with `0`s. – jhamon, Apr 01 '16 at 07:35

Lundin · Accepted Answer · 2016-04-01T07:54:14.670

12

Apparently your compiler treats plain char as signed, and you are on a little-endian, two's complement system.

123456789d = 075BCD15h
Little endian: 15 CD 5B 07

Thus the byte at offset 1, which is what *(cc+1) reads, has value 0xCD. When this is read as a signed char, you get -51 in signed decimal format.
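To check the byte layout yourself, here is a minimal sketch. The helper names `byte_in_memory` and `byte_if_little_endian` are made up for illustration, and the sketch assumes a 32-bit value and 8-bit bytes. The first function reports what is actually stored on your machine; the second computes what a little-endian machine would store.

```c
#include <stdint.h>
#include <string.h>

/* Returns the byte found at `offset` in the object representation of
   `value` on the machine this runs on (hypothetical helper). */
unsigned char byte_in_memory(uint32_t value, size_t offset)
{
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);  /* copy the raw bytes out */
    return bytes[offset];
}

/* Returns the byte a little-endian machine would store at `offset`:
   offset 0 holds the least significant byte (hypothetical helper). */
unsigned char byte_if_little_endian(uint32_t value, size_t offset)
{
    return (unsigned char)((value >> (8 * offset)) & 0xFFu);
}
```

For 123456789, `byte_if_little_endian` yields 0x15, 0xCD, 0x5B, 0x07 for offsets 0 to 3, which is the 21 205 91 7 seen in the question. If `byte_in_memory` returns the same values, the machine is little endian.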

When passed to printf, the character *(ccc+1) containing value -51 first gets implicitly promoted to int, because variadic functions like printf have a rule stating that all small integer arguments get promoted to int (the default argument promotions). The promotion preserves the value, so you still have -51, but as a 32-bit two's complement int its representation is 0xFFFFFFCD.

Finally, the %u specifier tells printf to treat this as an unsigned integer, so you end up with 4294967245 (roughly 4.29 billion).

The important part to understand here is that %u has nothing to do with the actual type promotion, it just tells printf how to interpret the data after the promotion.
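The two steps can be separated in code. This is only a sketch with made-up function names (`promoted_value`, `reinterpret_as_unsigned`); it uses an explicit conversion to unsigned int, which is well defined, rather than a mismatched printf specifier.

```c
#include <limits.h>

/* Step 1: a signed char widened to int keeps its value, exactly as the
   default argument promotions do for a variadic call. */
int promoted_value(signed char sc)
{
    return sc;
}

/* Step 2: converting a negative int to unsigned int is well defined:
   the result is the value modulo UINT_MAX + 1. */
unsigned int reinterpret_as_unsigned(int promoted)
{
    return (unsigned int)promoted;
}
```

With a 32-bit unsigned int, `reinterpret_as_unsigned(promoted_value(-51))` is UINT_MAX - 50, that is 4294967245, the number from the question.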

edited Apr 01 '16 at 07:54

answered Apr 01 '16 at 07:45

Lundin


"The important part to understand here is that %u has nothing to do with the actual type promotion, it just tells printf how to interpret the data after the promotion." - Nice explanation :) Thanks – Debashish Apr 01 '16 at 08:03
According to 7.21.6.1, passing an argument of the wrong type is undefined. char requires `hhd` and unsigned char requires `hhu`. As far as I understand the text, the argument type that matters is the one before the promotion, because of the wording in 7.21.6.1 p7: *the argument will have been promoted according to the integer promotions*. I don't think it is valid to print an unsigned char with `d` and rely on integer promotions. What do you think? – 2501 Apr 01 '16 at 10:25
@2501 I think the C standard is unclear. I don't understand why fprintf should dictate some special case of undefined behavior, which would somehow overwrite the well-defined behavior of argument promotion of variable-argument list parameters. Obviously if I do something completely weird, like using `%d` while passing a float, there will be undefined behavior. Why there would be undefined behavior when mixing different, otherwise compatible integer types is less obvious. – Lundin Apr 01 '16 at 10:56
@Lundin Well 7.21.6.1,p9 dictates ub. One might not like it but the text is there. I agree it is unclear. If after promotions we still have an argument then it is not ub. – 2501 Apr 01 '16 at 11:14
@2501 I suppose part of the issue is that specifiers for smaller integer types must still yield well-defined behavior. For example `uint8_t x; printf("%" PRIu8, x)` is well-defined, even though technically `x` will always get promoted to `int`. I've always thought of it as "if you use the correct format specifier, then getting the result right is no longer the responsibility of the programmer". What then happens in between the lines in forms of implicit promotions, is the compiler's problem to deal with. – Lundin Apr 01 '16 at 11:32
@2501 [fyi](http://stackoverflow.com/questions/27547377/format-specifier-for-unsigned-char) – Giorgi Moniava Apr 01 '16 at 13:04
@Lundin: The problem with passing an `int` to `printf("%u")` is that varargs-functions do not convert their arguments to the type of the parameter like normal functions (with prototypes) do. There is a special exemption for passing a `int` in place of an `unsigned` and vice versa, but *only* if the value is representable in both types. – EOF Apr 01 '16 at 13:25
@Lundin I think only using `hh` is correct. This is because unsigned char is not necessarily promoted to int, it could be promoted to unsigned int. So using `d` will cause ub on those architectures. This might seem ridiculous, but it is possible. Integer ranks guarantee that width of a lower rank is always a subrange, but this only holds true for the same signedness. See: 6.2.5. p8. Therefore it is possible that `UCHAR_MAX > INT_MAX`, while also always `UCHAR_MAX <= UINT_MAX`, so unsigned char gets promoted to unsigned int. – 2501 Apr 01 '16 at 13:29
@2501 `unsigned char` will always be promoted to `int` and never to `unsigned int`, according to the integer promotion rule, since an `int` can always represent all values of an `unsigned char`. (Unless it's some theoretical, obscure machine where `sizeof(char) == sizeof(int)`. I think we can ignore that artificial scenario.) – Lundin Apr 01 '16 at 14:40
@Lundin You cannot ignore that if you want your code to be strictly conforming. – 2501 Apr 01 '16 at 14:40
@2501 Sure I can. If one is worried about compatibility with theoretical, extremely weird and useless computers, simply add `_Static_assert(sizeof(char) < sizeof(int), "Your computer is too weird, why did you pick an extremely weird computer for? That's your problem, not mine. Have fun rewriting all my code.");`. – Lundin Apr 01 '16 at 14:43

score 8 · Answer 2 · edited May 23 '17 at 11:46

-51 stored in 8 bits is 0xCD in hex (assuming a two's complement system).

When you pass it to a variadic function like printf, default argument promotion takes place and char is promoted to int with representation 0xFFFFFFCD (for 4 byte int).

0xFFFFFFCD interpreted as int is -51 and interpreted as unsigned int is 4294967245.

Further reading: Default argument promotions in C function calls

please explain the last line of output. WHY three bytes of 1's are added

This is called sign extension. When a smaller signed number is converted to a larger type, its sign bit gets replicated so that the result still represents the same number (for example in one's and two's complement).
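Sign extension can also be written out by hand at the bit level. The sketch below assumes an 8-bit source and a 32-bit result, and `sign_extend_8_to_32` is a name invented for this example; the compiler normally does this for you.

```c
#include <stdint.h>

/* Widens an 8-bit two's complement pattern to 32 bits by copying the
   sign bit (bit 7) into the 24 new upper bits. */
uint32_t sign_extend_8_to_32(uint8_t byte)
{
    uint32_t widened = byte;       /* upper 24 bits start out as 0 */
    if (byte & 0x80u) {
        widened |= 0xFFFFFF00u;    /* negative: fill the upper bits with 1s */
    }
    return widened;
}
```

For 0xCD the result is 0xFFFFFFCD, the three extra bytes of 1s the question asks about. A byte with the top bit clear, such as 0x15, comes back unchanged.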

Bad printf format specifier
You are attempting to print a char with the specifier "%u", which expects an unsigned [int]. Passing an argument that does not match the conversion specifier in printf is undefined behavior according to 7.19.6.1 paragraph 9:

If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
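One way to stay clear of the mismatch is to read the byte as unsigned char and cast it to unsigned int before it reaches "%u". The following is a sketch only; `format_byte_unsigned` is a hypothetical helper, and it writes into a buffer with snprintf so the result can be inspected.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats the byte at `offset` inside `object` as an unsigned decimal
   number. Reading through unsigned char never produces a negative
   value, and the cast makes the argument type match "%u" exactly.
   Returns what snprintf returns. */
int format_byte_unsigned(char *out, size_t out_size,
                         const void *object, size_t offset)
{
    const unsigned char *bytes = object;
    return snprintf(out, out_size, "%u", (unsigned int)bytes[offset]);
}
```

Applied to the second byte of the question's `a`, this gives "205" rather than 4294967245. C99's `hh` length modifier (`%hhu`) is another option discussed in the comments.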

Use of char to store signed value
Also, to ensure the char holds a signed value, explicitly use signed char, because plain char may behave as either signed char or unsigned char. (In the latter case, the output of your snippet may be 205 205.) In gcc you can force char to behave as unsigned char with the -funsigned-char option.
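A small sketch of how to make the signedness explicit instead of relying on plain char. The function names are invented for this example, and the -51 result assumes a two's complement machine with 8-bit bytes.

```c
#include <limits.h>
#include <stddef.h>

/* Nonzero if plain char is signed with this compiler and these flags. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Reads one byte of `object` explicitly as signed char. */
int byte_as_signed(const void *object, size_t offset)
{
    const signed char *bytes = object;
    return bytes[offset];
}

/* Reads one byte of `object` explicitly as unsigned char. */
int byte_as_unsigned(const void *object, size_t offset)
{
    const unsigned char *bytes = object;
    return bytes[offset];
}
```

For a byte holding 0xCD, `byte_as_signed` gives -51 and `byte_as_unsigned` gives 205, whatever `plain_char_is_signed` reports.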

It's debatable if it is actually undefined behavior though. Because the argument after promotion is of type `int`, not `char`. Best is of course to avoid printf family of functions entirely, since they are so prone to UB. Variadic functions should also be avoided, for the same type safety reasons. — Lundin, Apr 01 '16 at 07:51
The compilation is not generating any warning regarding %u. And Thanks. — Debashish, Apr 01 '16 at 08:04
