I have a piece of C code that has an int array, and the code makes several reads from the array. When I compile it with GCC to x86 assembly using the -O0 flag, all the read accesses to the array in the assembly use the movl instruction, a 32-bit load. This makes sense because ints are 32 bits, so accesses to arrays of them should use 32-bit loads.
However, when I compile it using the -O3 flag, several of the 32-bit movl reads to the array are replaced with 64-bit loads into the XMM registers instead. I assume this is some sort of optimization, but the optimized disassembly is very challenging to decipher and I'm a bit lost about what's going on.
Without going into too much detail about my work, I need to use the -O3 flag, but I need all accesses to my 32-bit int array to use 32-bit loads.
Does anyone have any insight into what could possibly be going on, and how I can force all loads from my array to be 32 bits while still using the -O3 flag?
Example to reproduce:
Here's the C code:
#include <stdlib.h>

int main() {
    int* arr = malloc(sizeof(int) * 64);
    int sum = 0;
    for (int i = 0; i < 10; i++) {
        sum += arr[i] + arr[i+1];
    }
    if (sum == 0)
        return 0;
    else
        return 1;
}
For the unoptimized disassembly, compile with (note the 32 bit loads in the disassembly):
gcc -S -fverbose-asm -o mb64BitLoadsNoOpt.s mb64BitLoads.c
For the optimized disassembly, compile with (note the XMM register 64 bit loads in the disassembly):
gcc -O3 -S -fverbose-asm -o mb64BitLoadsOpt.s mb64BitLoads.c
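One thing that might be worth trying, as a minimal sketch rather than a confirmed fix, is reading the array through a volatile-qualified pointer. GCC treats each volatile read as a separate access, so it should not merge two neighbouring int reads into one wider load. The helper name sum_adjacent_pairs is hypothetical and not part of the original program; the loop body is the same as in the example above.

```c
#include <stdlib.h>

/* Hypothetical helper (not in the original program): sums
 * arr[i] + arr[i + 1] for i in [0, n). The caller must make sure
 * arr holds at least n + 1 elements.
 *
 * Assumption: because the pointer is volatile-qualified, the
 * compiler has to perform every read as its own int-sized access
 * and cannot combine adjacent elements into a single wider load. */
int sum_adjacent_pairs(const volatile int *arr, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += arr[i] + arr[i + 1];
    }
    return sum;
}
```

Compiling this with the same -O3 -S command and inspecting the output is the only way to confirm which loads are emitted for a given GCC version and target. The flags -fno-tree-vectorize and -fno-tree-slp-vectorize disable GCC's vectorizers and may be an alternative, though they apply to the whole translation unit rather than to one array.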