#include <limits.h>
#include <stdio.h>
int main() {
    long ival = 0;
    printf("ival: %li, min: %i, max: %i, too big: %i, too small: %i\n",
           ival, INT_MIN, INT_MAX, ival > INT_MAX, ival < INT_MIN);
}

This gives the output:

ival: 0, min: -2147483648, max: 2147483647, too big: 0, too small: 1

How is that possible?

(I actually got hit by this problem/bug in CPython 2.7.3 in getargs.c:convertsimple. If you look up the code, in case 'i', there is the check ival < INT_MIN which was always true for me. See also the test case source with further references.)


I have now tested a few different compilers. GCC and Clang, compiled for x86, both return the expected result (too small: 0). The unexpected output comes from the Clang in the Xcode toolchain when compiling for armv7.


If you want to reproduce:

This is the exact compile command: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang -arch armv7 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS5.1.sdk test-int.c

This is Xcode 4.3.2.

I copied the resulting a.out over to my iPhone and executed it.

If anyone is interested in the assembler code generated by this:

    .section    __TEXT,__text,regular,pure_instructions
    .section    __TEXT,__textcoal_nt,coalesced,pure_instructions
    .section    __TEXT,__const_coal,coalesced
    .section    __TEXT,__picsymbolstub4,symbol_stubs,none,16
    .section    __TEXT,__StaticInit,regular,pure_instructions
    .syntax unified
    .section    __TEXT,__text,regular,pure_instructions
    .globl  _main
    .align  2
    .code   16
    .thumb_func _main
_main:
    push    {r7, lr}
    mov r7, sp
    sub sp, #20
    movw    r0, #65535
    movt    r0, #32767
    movs    r1, #0
    movt    r1, #0
    str r1, [sp, #16]
    str r1, [sp, #12]
    ldr r1, [sp, #12]
    ldr r2, [sp, #12]
    cmp r2, r0
    movw    r0, #0
    it  gt
    movgt   r0, #1
    and r0, r0, #1
    ldr r2, [sp, #12]
    cmn.w   r2, #-2147483648
    movw    r2, #0
    it  lt
    movlt   r2, #1
    and r2, r2, #1
    mov r3, sp
    str r2, [r3, #4]
    str r0, [r3]
    mov.w   r2, #-2147483648
    mvn r3, #-2147483648
    movw    r0, :lower16:(L_.str-(LPC0_0+4))
    movt    r0, :upper16:(L_.str-(LPC0_0+4))
LPC0_0:
    add r0, pc
    blx _printf
    ldr r1, [sp, #16]
    str r0, [sp, #8]
    mov r0, r1
    add sp, #20
    pop {r7, pc}

    .section    __TEXT,__cstring,cstring_literals
L_.str:
    .asciz   "ival: %li, min: %i, max: %i, too big: %i, too small: %i\n"


.subsections_via_symbols
Albert
  • Might be a casting quirk? It is possible it is casting INT_MIN to long and not properly handling the sign? Or vice versa? O.o – TheZ Jun 20 '12 at 23:00
  • @Albert interesting, I get `ival: 0, min: -2147483648, max: 2147483647, too big: 0, too small: 0` – Samy Vilar Jun 20 '12 at 23:08
  • I don't think there is a %i format for printf. You probably want %d. And it is a good habit to explicitly cast the arguments for varargs functions like printf to int. (In this case it is not needed, because the value of (a>b) has type int.) – wildplasser Jun 20 '12 at 23:15
  • @wildplasser: Yes, there is a `%i` format for `printf`. It is the same as `%d`. – Dietrich Epp Jun 20 '12 at 23:18
  • Oops. My bad. I thought it was a relic from a former extension. – wildplasser Jun 20 '12 at 23:20
  • Why all the upvotes? I can think of no way this could have happened, and if the question were by a new user, I would guess it was an attempt to fish for rep, but Albert seems credible... – R.. GitHub STOP HELPING ICE Jun 21 '12 at 01:05
  • @R..: Well, this looks interesting, I guess. I hesitated a bit myself. Should I even ask it, or is it obvious that this must be a compiler bug? I was almost sure. But maybe I was overlooking some strange case. Other upvoters are very probably just as curious about this as I am. -- Or are you questioning whether such a compiler bug can happen? All details to reproduce it are there... – Albert Jun 21 '12 at 01:45

2 Answers


This is an error. There is no room in the C standard for too small to be anything other than 0. Here's how it works:

  1. Since INT_MIN is an int, it gets converted to long during the "usual arithmetic conversions". This happens because long has higher rank than int (and both are signed types). No promotions occur, since all of the operands have at least int rank. No undefined or implementation-defined behavior is invoked.

  2. During conversion, the value of INT_MIN is preserved. Since it is being converted from int to long, and long is guaranteed to have at least the range of int, the value must be preserved. No undefined or implementation-defined behavior is invoked. No modular conversions are permitted; those apply to unsigned types only.

  3. The result of the comparison should be 0.

There is no wiggle room for sign extension or other such things. Also, since the call to printf is correct, there is no problem there.

If you can reproduce it on another system, or send it to someone else who can reproduce it, you should report the bug directly to your toolchain vendor.

Attempts to reproduce the bug: I was not able to reproduce the behavior on any of the following combinations, each with optimization both on and off:

  • GCC 4.0, PPC + PPC64
  • GCC 4.2, PPC + PPC64
  • GCC 4.3, x64
  • GCC 4.4, x64
  • Clang 3.0, x64
Dietrich Epp
  • I can also only reproduce it with the Clang from the Xcode (4.3.2) toolchain for arch armv7 with -O0. With optimizations, I get the expected result. Seems like an actual bug. I reported it. – Albert Jun 20 '12 at 23:40

What does this print?

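#include <limits.h>
#include <stdio.h>

int main(void) {
    /* Shows how wide long is on this platform. */
    printf("%016ld\n", LONG_MAX);

    /* Shows the bit pattern of INT_MIN after conversion to long. */
    long l_int_min = (long)INT_MIN;
    printf("%016lx\n", l_int_min);
}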

I'm wondering whether INT_MIN is getting coerced to long without being sign-extended. That would make 0 smaller than the resulting value.

EDIT: okay, the result of the first printf() was 0000002147483647, which means that long is 32-bit on that platform, just like int. So casting an int to a long shouldn't actually change anything.

I try to reserve "it's a compiler bug" as a last resort, but this is looking like a compiler bug to me.

steveha
  • Converting `int` to `long` without preserving sign is an error, according to the C standard. No compliant C implementation may do such a thing. – Dietrich Epp Jun 20 '12 at 23:24
  • `0000002147483647` `0000000080000000` – Albert Jun 20 '12 at 23:26
  • The %16lx format expects an unsigned long int argument. The l_int_min is passed as a signed long. Its value was initialised using INT_MIN which was sign-extended before use. – wildplasser Jun 20 '12 at 23:30
  • Yes, the `%16lx` format expects an unsigned long. But because `printf()` isn't really type-safe, if you pass a signed long and print it with that format, I would expect it to print a sensible result. – steveha Jun 20 '12 at 23:42