Why do the 32-bit and 64-bit compilers make such a difference with my code?

Question

Excuse my bad English.

I have written down some lines to return max, min, sum of all values, and arrange all values in ascending order when five integers are input.

While writing, I mistakenly wrote 'num[4]' when I declared the int array, although I needed room for 5 integers. When I compiled with the TDM-GCC 4.9.2 64-bit release, it worked without any problem. When I noticed the mistake and switched to the TDM-GCC 4.9.2 32-bit release, it did not.

This is my whole code:

#include<stdio.h>

int main()
{

    int num[4],i,j,k,a,b,c,m,number,sum=0;
    printf("This program returns max, min, sum of all values, and arranges all values in ascending order when five integers are input.\n");
    printf("Please enter five integers.\n");

    for(i=0;i<5;i++)
    {
        printf("Enter #%d\n",i+1);
        scanf("%d",&num[i]);
    }

    //arrange all values
    for(j=0;j<5;j++)
    {
        for(k=j+1;k<5;k++)
        {
            if(num[j]>num[k])
            {
                number=num[j];
                num[j]=num[k];
                num[k]=number;
            }
        }
    }

    //find maximum value 
    int max=num[0];
    for(a=1;a<5;a++)
    {
        if(max<num[a]) 
        {
            max=num[a];
        }
    }   

    //find minimum value
    int min=num[0];
    for(b=1;b<5;b++)
    {
        if(min>num[b])  
        {
            min=num[b];
        }
    }

    //find sum of all values
    for(c=0;c<5;c++)
    {
        sum=sum+num[c]; 
    } 

    printf("Max Value : %d\n",max);//print max
    printf("Min Value : %d\n",min);//print min 
    printf("Sum : %d\n",sum); //print sum

    printf("In ascending order : "); //print all values in ascending order
    for(m=0;m<5;m++)
    {
        printf("%d ",num[m]);
    }
}

I am new to C and to programming in general, and I don't know how to search for this kind of problem. I know that asking like this may be inappropriate here, and I sincerely apologize to anyone irritated by this type of post. This is my best try, so please don't blame me; I'm willing to accept any kind of advice or tips.

Thank you.

Is your real question, "why did this bug surface in 32 bit compiler and not 64 bit compiler" ? — erik258, Nov 16 '19 at 17:04
What inputs are you giving? And does the program fail for any inputs? — srccode, Nov 16 '19 at 17:08
Normally I would say that the difference between 64-bit and 32-bit compilation is the instruction set that is used. But in this case it probably also affects the way the compiler allocates memory in chunks of 32 / 64 bits. With 32-bit compilation things may well be packed closer together, with insufficient spare space (allocated but not used) to accommodate the extra item in `num`. With 64-bit compilation there is some more slack, and the extra data just happens to fall into an unused area and does not cause an exception or error. — pjaj, Nov 16 '19 at 17:14

tay10r · Answer 1 · 2019-11-16T17:38:44.923

When allocating on the stack, GCC targeting 64-bit (and probably Clang) will align stack allocations to 8 bytes.

For 32-bit targets, it typically aligns them to only 4 bytes.

So when you compiled your program for 64-bit, extra bytes were used to pad the stack frame. That's most likely why it didn't crash when you accessed that last integer: the write landed in padding instead of in something that matters.

To see this in action, we'll create a test file.

void func() {
    int n[4];
    int b = 11;
    for (int i = 0; i < 4; i++) {
        n[i] = b;
    }
}

And we'll compile it for 32-bit and 64-bit.

gcc -g -c -m64 test.c -o test_64.o
gcc -g -c -m32 test.c -o test_32.o

And now we'll print the disassembly for each.

objdump -S test_64.o >test_64_dis.txt
objdump -S test_32.o >test_32_dis.txt

Here's the contents of the 64-bit version.

test_64.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <func>:
void func() {
   0:   f3 0f 1e fa             endbr64 
   4:   55                      push   %rbp
   5:   48 89 e5                mov    %rsp,%rbp
   8:   48 83 ec 30             sub    $0x30,%rsp
   c:   64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
  13:   00 00 
  15:   48 89 45 f8             mov    %rax,-0x8(%rbp)
  19:   31 c0                   xor    %eax,%eax
    int n[4];
    int b = 11;
  1b:   c7 45 dc 0b 00 00 00    movl   $0xb,-0x24(%rbp)
    for (int i = 0; i < 4; i++) {
  22:   c7 45 d8 00 00 00 00    movl   $0x0,-0x28(%rbp)
  29:   eb 10                   jmp    3b <func+0x3b>
        n[i] = b;
  2b:   8b 45 d8                mov    -0x28(%rbp),%eax
  2e:   48 98                   cltq   
  30:   8b 55 dc                mov    -0x24(%rbp),%edx
  33:   89 54 85 e0             mov    %edx,-0x20(%rbp,%rax,4)
    for (int i = 0; i < 4; i++) {
  37:   83 45 d8 01             addl   $0x1,-0x28(%rbp)
  3b:   83 7d d8 03             cmpl   $0x3,-0x28(%rbp)
  3f:   7e ea                   jle    2b <func+0x2b>
    }
}
  41:   90                      nop
  42:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  46:   64 48 33 04 25 28 00    xor    %fs:0x28,%rax
  4d:   00 00 
  4f:   74 05                   je     56 <func+0x56>
  51:   e8 00 00 00 00          callq  56 <func+0x56>
  56:   c9                      leaveq 
  57:   c3                      retq

Here's the 32-bit version.

test_32.o:     file format elf32-i386


Disassembly of section .text:

00000000 <func>:
void func() {
   0:   f3 0f 1e fb             endbr32 
   4:   55                      push   %ebp
   5:   89 e5                   mov    %esp,%ebp
   7:   83 ec 28                sub    $0x28,%esp
   a:   e8 fc ff ff ff          call   b <func+0xb>
   f:   05 01 00 00 00          add    $0x1,%eax
  14:   65 a1 14 00 00 00       mov    %gs:0x14,%eax
  1a:   89 45 f4                mov    %eax,-0xc(%ebp)
  1d:   31 c0                   xor    %eax,%eax
    int n[4];
    int b = 11;
  1f:   c7 45 e0 0b 00 00 00    movl   $0xb,-0x20(%ebp)
    for (int i = 0; i < 4; i++) {
  26:   c7 45 dc 00 00 00 00    movl   $0x0,-0x24(%ebp)
  2d:   eb 0e                   jmp    3d <func+0x3d>
        n[i] = b;
  2f:   8b 45 dc                mov    -0x24(%ebp),%eax
  32:   8b 55 e0                mov    -0x20(%ebp),%edx
  35:   89 54 85 e4             mov    %edx,-0x1c(%ebp,%eax,4)
    for (int i = 0; i < 4; i++) {
  39:   83 45 dc 01             addl   $0x1,-0x24(%ebp)
  3d:   83 7d dc 03             cmpl   $0x3,-0x24(%ebp)
  41:   7e ec                   jle    2f <func+0x2f>
    }
}
  43:   90                      nop
  44:   8b 45 f4                mov    -0xc(%ebp),%eax
  47:   65 33 05 14 00 00 00    xor    %gs:0x14,%eax
  4e:   74 05                   je     55 <func+0x55>
  50:   e8 fc ff ff ff          call   51 <func+0x51>
  55:   c9                      leave  
  56:   c3                      ret    

Disassembly of section .text.__x86.get_pc_thunk.ax:

00000000 <__x86.get_pc_thunk.ax>:
   0:   8b 04 24                mov    (%esp),%eax
   3:   c3                      ret

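You can see the difference if you look at the offsets right after the variable declarations. In the 64-bit build, b is at -0x24(%rbp) and n runs from -0x20(%rbp) up to -0x10(%rbp), while the stack protector's canary is stored at -0x8(%rbp). That leaves 8 unused bytes after the array, which is where a write to n[4] would land. In the 32-bit build, b is at -0x20(%ebp) and n runs from -0x1c(%ebp) up to -0xc(%ebp), which is exactly where the canary is stored, so a write to n[4] would overwrite it and the check at the end of the function would fail. These listings are from an ELF build of this test file; TDM-GCC may lay out the frame differently, but the same idea applies.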

Regarding the advice/tips you asked for, a good starting point would be to enable all compiler warnings and treat them as errors. In GCC and Clang, you'd use the flags -Wall -Wextra -Werror -Wfatal-errors.

I wouldn't recommend this if you're using the MSVC compiler, though, which often issues warnings about declarations from the header files it's distributed with.

score 3 · Answer 2 · answered Nov 16 '19 at 17:41

Other answers cover what might be actually happening, by analyzing the generated assembly, but the really relevant explanation is: Indexing out of array bounds is Undefined Behavior in C. And that's kinda the end of story.

UB means the code is "allowed" to do anything by the C standard. It could do a different thing every time it is run. It could do what you want it to do with no ill effects. It might do what you want, but then something completely unrelated behaves in a funny way. Compiler, operating system, or even phase of the moon could make a difference. Or not.

It is generally not useful to think about what actually happens with Undefined Behavior at the C level. You can of course produce the assembly output of a particular compilation and inspect what it does, but that is the result of that one compilation. A new compilation might change things (even if you just do a new build at a different time, because the value of the __TIME__ macro depends on time...).
