15

The Standard specifies that hexadecimal constants like 0x8000 (larger than fits in a signed integer) are unsigned (just like octal constants), whereas decimal constants like 32768 are signed long. (The exact types assume a 16-bit integer and a 32-bit long.) However, in regular C environments both will have the same representation, in binary 1000 0000 0000 0000. Is a situation possible where this difference really produces a different outcome? In other words, is a situation possible where this difference matters at all?

Johan Bezem
  • I can imagine a situation where you tried to subtract an even larger number from `0x8000` and it didn't work as expected because it's unsigned. But that's not really likely to happen. – Seth Carnegie Nov 09 '11 at 18:14
  • @user786653: Yes, exactly there, and in the table on the next page you have two columns differentiating between decimal constants and hexadecimal and octal constants, where the hex and octal constants (without suffix) also have the unsigned variants, in contrast to the decimal constants. (Comment removed; see http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf page 55f.) – Johan Bezem Nov 09 '11 at 18:22
  • 1
    @JohanBezem: Yeah, sorry I should have just edited my comment when I realized I was an idiot instead of deleting it. – user786653 Nov 09 '11 at 18:27

5 Answers

8

Yes, it can matter. If your processor has a 16-bit int and a 32-bit long type, 32768 has the type long (since 32767 is the largest positive value fitting in a signed 16-bit int), whereas 0x8000 (since it is also considered for unsigned int) still fits in a 16-bit unsigned int.

Now consider the following program:

int main(int argc, char *argv[])
{
  volatile long long_dec = ((long)~32768);
  volatile long long_hex = ((long)~0x8000);

  return 0;
}

When 32768 is considered long, the bitwise complement (`~`) will invert 32 bits, resulting in a representation 0xFFFF7FFF with type long; the cast is superfluous. When 0x8000 is considered unsigned int, the complement will invert only 16 bits, resulting in a representation 0x7FFF with type unsigned int; the cast will then zero-extend to a long value of 0x00007FFF. See H&S5, section 2.7.1, page 24ff.

It is best to augment the constants with U, UL or L as appropriate.

Johan Bezem
  • 7
    `void main`? You gotta be kidding. –  Nov 09 '11 at 18:18
  • 4
    @Fanael: Sorry, yes, you are right. Thanks for the correction! +1 But in embedded systems we often use `void` since there's no environment to return the `int` to. – Johan Bezem Nov 09 '11 at 18:27
  • 1
    @OliCharlesworth: For a hosted implementation, `void main` is valid only if the implementation explicitly supports and documents it; otherwise it makes the program's behavior undefined. For freestanding implementations, the entry point is implementation-defined; again, it's up to the implementation to decide whether `void main` is valid. But `int main(void)` is *always* valid for hosted implementations, and there's no sane reason not to use it or `int main(int argc, char *argv)` or equivalent. – Keith Thompson Nov 09 '11 at 19:50
  • @KeithThompson True. But you missed the second asterisk: `char * * argv` ;-) – Johan Bezem Nov 09 '11 at 19:56
  • 1
    @JohanBezem: D'oh! But I actually meant to write `char *argv[]` (which, as a parameter, is equivalent to `char **argv`). – Keith Thompson Nov 09 '11 at 19:59
  • 1
    @Keith: Agreed. I just get irritated whenever someone takes the trouble to comment on `void main` when someone's used it in some random code snippet, like in this instance! – Oliver Charlesworth Nov 09 '11 at 21:06
  • 2
    @OliCharlesworth: My advice, seriously, is to get over your irritation. `void main` is a very common error -- and yes, in most contexts it *is* an error. It tends to appear in newbie questions, and newbies are usually using hosted implementations. Pointing it out is a good thing. (It usually indicates that the programmer learned from a bad book written by an author who should have known better.) – Keith Thompson Nov 09 '11 at 22:19
  • @KeithThompson And I don't consider myself a newbie anymore, using C since 1984... :-) – Johan Bezem Nov 09 '11 at 22:30
  • Change it to "void blah(void)" and avoid the issue altogether, while continuing to avoid the need for a superfluous return statement. – supercat Nov 20 '11 at 21:02
  • I've just edited the answer to refer to `int` rather than `integer`. `int` is one of several *integer* types. – Keith Thompson Nov 20 '11 at 22:46
  • @KeithThompson While the change of integer to int is valid if absolutely minor, the addition of formatting for every element possibly occurring in source code to show source code formatting is negatively impacting readability IMHO, because of the gray background color. – Johan Bezem Nov 21 '11 at 06:18
  • 1
    @JohanBezem: The change of integer to int is vital, in my opinion. The word "integer" refers to all the types from char to long long; "int" is the specific type you were actually referring to. As for the formatting, I like it better with than without, but I see your point, and if you want to change it back I won't mind. – Keith Thompson Nov 21 '11 at 06:39
  • I reversed the formatting for constants, but left it alone for the C types. For me, 'vital' is too strong, even if I basically agree with you. If we're talking about a "processor with a 16-bit integer", no one will assume its `char` type - nor its `long` type for that matter - to be 16-bit... Would you? – Johan Bezem Nov 21 '11 at 07:13
1

On a platform with a 32-bit int and a 64-bit long, a and b in the following code will have different values:

int x = 2;
long a = x * 0x80000000; /* multiplication done in unsigned -> 0           */
long b = x * 2147483648; /* multiplication done in long     -> 0x100000000 */
undur_gongor
  • 1
    Incidentally, there's a nasty gotcha related to the above: if `long x;`, how does the effect of `x &= ~0x80000000;` compare to that of `x &= ~0x100000000;` or `x &= ~0x40000000;`? – supercat Jun 27 '12 at 22:43
1

Another example not yet given: compare (with greater-than or less-than operators) -1 to both 32768 and to 0x8000. Or, for that matter, try comparing each of them for equality with an 'int' variable equal to -32768.

supercat
  • Be aware that `-1` is a constant 'one' with a unary minus. The usual unary and binary conversions are applied, so you will not see any curious behavior in any of your cases IMHO. – Johan Bezem Nov 21 '11 at 06:58
  • @JohanBezem: It is a constant one with unary minus, but the result of the unary minus will be of type (signed) int. Comparing -1 to 32768 will yield a signed comparison between the long value -1 and the long value 32768. Comparing -1 to 0x8000 will yield an unsigned comparison between 0xFFFF and 0x8000. It's not possible to define an int literal equal -32768, which is why I specified a variable. – supercat Nov 21 '11 at 09:05
  • Oh, I see. Yes, you are right, but the necessary conversions outside the range of the target type are technically undefined AFAIK (H&S5 6.2.3), even if most/all practical implementations will use twos-complement and produce 'curious' results. Thanks! – Johan Bezem Nov 21 '11 at 10:01
  • @JohanBezem: What conversions would be undefined? Conversion of -1 to `unsigned int` is required to give the `unsigned int` value which, when added to 1, will yield zero. Likewise if an implementation allows for the existence of an `int` variable equal to -32768, converting that to `unsigned int` must yield the `unsigned int` value which, when added to 32768, will yield zero. Two's-complement has *nothing* to do with it. – supercat Jul 29 '15 at 19:10
1

Assuming int is 16 bits and long is 32 bits (which is actually fairly unusual these days; int is more commonly 32 bits):

printf("%ld\n", 32768);  // prints "32768"
printf("%ld\n", 0x8000); // has undefined behavior

In most contexts, a numeric expression will be implicitly converted to an appropriate type determined by the context. (That's not always the type you want, though.) This doesn't apply to non-fixed arguments to variadic functions, such as any argument to one of the *printf() functions following the format string.

Keith Thompson
  • There are scores of embedded processors with 16-bit `int`s in active development use today. That being said, your first `printf` uses a long integer constant printed as a long decimal value, no problem (as you indicated); IMHO the second `printf` is also no problem, since a conversion from `unsigned int` to `signed long` is defined and value-preserving. So in my opinion, no "undefined behavior" in the second expression. – Johan Bezem Nov 21 '11 at 06:55
  • 2
    @JohanBezem: Good point about embedded processors. Variadic arguments (i.e., arguments corresponding to the `, ...` in the function's declaration) are not converted, other than the default argument promotions, because the compiler doesn't know that a `long int` is expected. For a `printf` call, if the promoted argument type doesn't match the type specified by the format, the behavior is undefined; C99 7.19.6.1p9 says this explicitly. – Keith Thompson Nov 21 '11 at 07:09
0

The difference shows up when you try to add a value to the constant: with a 16-bit int, the addition may fail because the result exceeds the bounds of the type, whereas with a 32-bit long you could add any number less than 2^16 to it.

yoyomommy