We can use the Compiler Explorer at godbolt.org
to look at your example. We'll use the following testbench code:
int test() {
int a[15] = {0};
int b[15] = {0};
for (int i = 0; i < 15; i++){
a[b[i]]++;
}
return 0;
}
Godbolt shows the x86 assembly, not the LLVM bytecode, but I've summarized it a bit to show what's going on. Here it is at -O0 -m32
:
test():
# set up stack
.LBB0_1:
cmp dword ptr [ebp - 128], 15 # i < 15?
jge .LBB0_4 # no? then jump out of loop
mov eax, dword ptr [ebp - 128] # load i
mov eax, dword ptr [ebp + 4*eax - 124] # load b[i]
mov ecx, dword ptr [ebp + 4*eax - 64] # load a[b[i]]
add ecx, 1 # increment it
mov dword ptr [ebp + 4*eax - 64], ecx # store it back
mov eax, dword ptr [ebp - 128]
add eax, 1 # increment i
mov dword ptr [ebp - 128], eax
jmp .LBB0_1 # repeat
.LBB0_4:
# tear down stack
ret
This looks like we'd expect: the loop is clearly visible and it does all the steps we listed. If we compile at -O1 -m32 -march=i386
, we see the loop is still there but it's much simpler:
test(): # @test()
# set up stack
.LBB0_1:
mov ecx, dword ptr [esp + 4*eax] # load b[i]
inc dword ptr [esp + 4*ecx + 60] # increment a[b[i]]
inc eax # increment i
cmp eax, 15 # compare == 15
jne .LBB0_1 # no? then loop
# tear down stack
ret
Clang now uses the inc
instruction (useful), noticed it could use the eax
register for the loop counter i
(neat), and moved the condition check to the bottom of the loop (probably better). We can still recognize our original code, though. Now let's try with -O2 -m32 -march=i386
:
test():
xor eax, eax # does nothing
ret
That's it? Yes.
clang
has detected that the a
array can never be used outside of the function. This means doing the incrementing will never affect any other part of the program - and also that nobody will miss it when it's gone.
Removing the increment leaves a for
loop with an empty body and no side effects, which can also be removed. In turn, removing the loop leaves an (for all intents and purposes) empty function.
This empty function is likely what you were seeing in the LLVM bytecode (ret i32 0
).
This is not a very scientific description, and the steps clang
takes might be different, but I hope the example clears it up a bit. If you want, you can read up on the as-if rule. I also recommend playing around on https://godbolt.org/ for a bit: see what happens when you move a
and b
outside the function, for example.