
I was working on an embedded project when I ran into something which I thought was strange behaviour. I managed to reproduce it on codepad (see below) to confirm, but don't have any other C compilers on my machine to try it on them.

Scenario: I have a #define for the most negative value a 32-bit integer can hold, and then I try to use this to compare with a floating point value as shown below:

#include <stdio.h>

#define INT32_MIN (-2147483648L)

void main()
{
    float myNumber = 0.0f;
    if(myNumber > INT32_MIN)
    {
        printf("Everything is OK");
    }
    else
    {
        printf("The universe is broken!!");
    }
}

Codepad link: http://codepad.org/cBneMZL5

To me it looks as though this code should work fine, but to my surprise it prints out The universe is broken!!.

This code implicitly converts INT32_MIN to a float, but the result is the floating-point value 2147483648.0 (positive!), even though the floating-point type is perfectly capable of representing -2147483648.0.

Does anyone have any insights into the cause of this behaviour?

CODE SOLUTION: As Steve Jessop mentioned in his answer, limits.h and stdint.h already contain correct (working) int range defines, so I'm now using these instead of my own #define.

PROBLEM/SOLUTION EXPLANATION SUMMARY: Given the answers and discussions, I think this is a good summary of what's going on (note: still read the answers/comments because they provide a more detailed explanation):

  • I'm using a C89 compiler with 32-bit longs, so any decimal constant with the L suffix whose value is greater than LONG_MAX and less than or equal to ULONG_MAX has type unsigned long.
  • (-2147483648L) is actually a unary - on an unsigned long (see previous point) value: -(2147483648L). This negation operation 'wraps' the value around to be the unsigned long value of 2147483648 (because 32-bit unsigned longs have the range 0 - 4294967295).
  • This unsigned long number looks like the expected negative int value when it is printed as an int or passed to a function, because it is first converted to an int. That conversion wraps the out-of-range value 2147483648 around to -2147483648 (32-bit ints have the range -2147483648 to 2147483647). Strictly speaking the result of this conversion is implementation-defined, but wraparound is common.
  • The cast to float, however, is using the actual unsigned long value 2147483648 for conversion, resulting in the floating-point value of 2147483648.0.
avanzal
  • fwiw, the universe isn't (as) broken if you use gcc 4.2.1. (So you're probably right about the compiler.) – Turix Jul 18 '12 at 07:38
  • are you on a 32 bit system? note that - is an operator. it is not part of the integer literal. – Johannes Schaub - litb Jul 18 '12 at 07:39
  • Can't reproduce with gcc 4.2.1. What compiler are you using? – Keith Flower Jul 18 '12 at 07:40
  • I can reproduce this with gcc 3.1. Yeah, I know that's a bit old :-) With gcc 4.6 on a 32-bit system `gcc -std=c90` produces the observed behaviour, `gcc -std=c99` works "as expected". – Philip Kendall Jul 18 '12 at 07:40
  • See also this question http://stackoverflow.com/questions/9941261/warning-this-decimal-constant-is-unsigned-only-in-iso-c90 which contains some standards discussion. – Philip Kendall Jul 18 '12 at 07:45
  • I assume that you added the `L` suffix to your constant specifically to avoid the problem described in the answers. However, on your platform `int` and `long` have the same size, which is why the overflow persists. You can add `LL` suffix instead and the overflow should go away (assuming `long long` has larger range) – AnT stands with Russia Jul 18 '12 at 17:11
  • It's `int main(void)`, *not* `void main()`. – Keith Thompson Jul 20 '12 at 01:50
  • Slight correction to your explanation: it's not true that *any* values with `L` suffix are `unsigned long`, it's specifically values greater than `LONG_MAX` and less or equal to `ULONG_MAX`. – Steve Jessop Jul 20 '12 at 08:08

2 Answers


Replace

#define INT32_MIN (-2147483648L)

with

#define INT32_MIN (-2147483647 - 1)

-2147483648 is not a single literal: the compiler parses it as the negation of 2147483648, and 2147483648 is too large for a 32-bit int. Writing (-2147483647 - 1) instead keeps every intermediate value in range.
This describes C89 behaviour. See Steve Jessop's answer for C99.
Also, long is typically 32 bits on 32-bit machines and often (but not always, e.g. Win64) 64 bits on 64-bit machines. A plain int does the job here.

WiSaGaN
  • More specifically, writing -2147483648 causes the compiler to evaluate it as -(2147483648). The literal in brackets overflows, resulting in -(-2147483648) = 2147483648. The universe then shatters. – Thomas Jul 18 '12 at 07:46
  • The moral is that there are no negative integral literals, only (unsigned) integral literals and unary minuses. – Kerrek SB Jul 18 '12 at 07:49
  • arguing solely on the bitpatterns on two complement, it should end up as negative. probably an optimization pass interferes here. – Johannes Schaub - litb Jul 18 '12 at 07:49
  • @KerrekSB Exactly. I haven't checked the C standard. I suppose there should be something explicitly saying this. – WiSaGaN Jul 18 '12 at 07:51
  • Be careful about generalizations on the type `long` - it's not always 64-bits on 64-bit systems (Win64 for example). – Michael Burr Jul 18 '12 at 08:28
  • @MichaelBurr You are right. It's the typical case, not a sure thing. I'll edit it later. – WiSaGaN Jul 18 '12 at 08:37
  • Ah yes this solves the problem. I didn't even consider that this could have been the case, mainly because I did things like assign INT32_MIN to a 32-bit int variable, pass INT32_MIN to a function, and print INT32_MIN all without any issues. What would cause this difference in behaviour? – avanzal Jul 19 '12 at 00:10
  • @WiSaGaN The code to demonstrate can be found at: [http://codepad.org/JkRoyKRe](http://codepad.org/JkRoyKRe). My compiler (TI CGT v3.2.2 for MSP430) seems to behave the same way as codepad in this situation. – avanzal Jul 19 '12 at 06:58
  • @avanzal That's interesting. If you define `int a = INT32_MIN;`, then pass `a` to where it would previously fail, namely `printf("As float (straight): %f (FAIL)\n", (float)a);`, you would get correct result. It seems there's weird things the compiler are doing. – WiSaGaN Jul 19 '12 at 07:35
  • @WiSaGaN: it's not that weird: if you assign the out-of-range positive value `2147483648` to `int` then the result is implementation-defined, but it's not uncommon to see it wraparound to `-2147483648`. – Steve Jessop Jul 19 '12 at 08:41
  • @SteveJessop Very interesting, the wraparound was the last piece of information I needed for this to 'click' in my head. I think I understand what's going on now, so thanks to everyone! I've added a summary to my original post. I think it's accurate, but feel free to edit it or let me know if it's not. – avanzal Jul 20 '12 at 01:19
  • @WiSaGaN: A lot of code expects `unsigned long` to wrap at 2^32 and will behave very badly if it does not do so. The fundamental problem is that C has consistently failed to provide unsigned types with predictable wrapping behavior (it still doesn't have them, since a standards-compliant compiler could legitimately have a 64-bit `int` type but include a `uint32_t` type; on such a compiler, multiplying together two `uint32_t` values could turn the CPU into a heap of molten slag). – supercat Sep 24 '14 at 17:24

In C89 with a 32 bit long, 2147483648L has type unsigned long int (see 3.1.3.2 Integer constants). So once modulo arithmetic has been applied to the unary minus operation, INT32_MIN is the positive value 2147483648 with type unsigned long.

In C99, 2147483648L has type long if long is bigger than 32 bits, or long long otherwise (see 6.4.4.1 Integer constants). So there is no problem and INT32_MIN is the negative value -2147483648 with type long or long long.

Similarly in C89 with long larger than 32 bits, 2147483648L has type long and INT32_MIN is negative.

I guess you're using a C89 compiler with a 32 bit long.

One way to look at it is that C99 fixes a "mistake" in C89. In C99 a decimal literal with no U suffix always has signed type, whereas in C89 it may be signed or unsigned depending on its value.

What you should probably do, btw, is include limits.h and use INT_MIN for the minimum value of an int, and LONG_MIN for the minimum value of a long. They have the correct value and the expected type (INT_MIN is an int, LONG_MIN is a long). If you need an exact 32 bit type then (assuming your implementation is 2's complement):

  • for code that doesn't have to be portable, you could use whichever type you prefer that's the correct size, and assert it to be on the safe side.
  • for code that has to be portable, search for a version of the C99 header stdint.h that works on your C89 compiler, and use int32_t and INT32_MIN from that.
  • if all else fails, write stdint.h yourself, and use the expression in WiSaGaN's answer. It has type int if int is at least 32 bits, otherwise long.
Steve Jessop
  • I didn't know that. Thanks for providing the standard. – WiSaGaN Jul 18 '12 at 08:38
  • Small nit, it might be better to explicitly state that `In C99, 2147483648L has type long long` also assumes 32-bit `long`. – Daniel Fischer Jul 18 '12 at 09:05
  • Interestingly and informatively, [limits.h](http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/i386/limits.h) (first online source I could find...) uses the `#define INT_MIN (-2147483647 - 1)` mechanism [recommended by WiSaGaN](http://stackoverflow.com/a/11536486/325514). – Kevin Vermeer Jul 18 '12 at 19:06
  • Great info, thanks for that it was very informative. I am using the TI CGT v3.2.2 for MSP430 (embedded project), but I'm not really sure if this is C89 or something else. I'm now using the limits.h header for my defines so thanks for that tip too. As I said in my comment to the other answer, I completely missed that this could be the case because I was using this define without problem (assigning it to variables, passing it to functions, printing it out, etc) until I tried to convert it to a float. Thanks for your help! – avanzal Jul 19 '12 at 00:15