I'm using gcc to compile for a 64-bit MIPS machine, and I noticed something interesting in a piece of the generated assembly. Here are the details:
00000001200a4348 <get_pa_txr_index+0x50> 2ca2001f sltiu v0,a1,31
00000001200a434c <get_pa_txr_index+0x54> 14400016 bnez v0,00000001200a43a8 <get_pa_txr_index+0xb0>
00000001200a4350 <get_pa_txr_index+0x58> 64a2000e daddiu v0,a1,14
00000001200a43a8 <get_pa_txr_index+0xb0> 000210f8 dsll v0,v0,0x3
00000001200a43ac <get_pa_txr_index+0xb4> 0062102d daddu v0,v1,v0
00000001200a43b0 <get_pa_txr_index+0xb8> dc440008 ld a0,8(v0)
00000001200a43b4 <get_pa_txr_index+0xbc> df9955c0 ld t9,21952(gp)
00000001200a43b8 <get_pa_txr_index+0xc0> 0320f809 jalr t9
00000001200a43bc <get_pa_txr_index+0xc4> 00000000 nop
Normally the bnez at 0x54 jumps straight to 0xb0. I am sure the block starting at 0xb0 must use a1 as an input, yet a1 never appears anywhere in that block.
Apart from the comparison at 0x50, the only use of a1 is at 0x58, the instruction right after the bnez (0x54).
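To make the dependency concrete, here is a small Python sketch of the data flow in the listing above. The function name `offset_from_v1` and the `slot_executed` flag are my own, purely for illustration. It models only the instructions shown, for the case where the branch is taken, and assumes nothing about how the hardware actually schedules them.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def offset_from_v1(a1, slot_executed):
    """Byte offset from v1 that 'ld a0,8(v0)' at 0xbc reads when the bnez is taken.

    slot_executed is a hypothetical switch: True models 0x58 having run
    before the block at 0xb0, False models it having been skipped.
    """
    v0 = 1 if a1 < 31 else 0      # 0x50: sltiu v0,a1,31 (a1 treated as unsigned)
    if v0 == 0:
        return None               # branch not taken; that path is not in the listing
    if slot_executed:
        v0 = (a1 + 14) & MASK64   # 0x58: daddiu v0,a1,14
    v0 = (v0 << 3) & MASK64       # 0xb0: dsll v0,v0,0x3
    return v0 + 8                 # 0xb4 adds v1; 0xbc adds the displacement 8


if __name__ == "__main__":
    print(offset_from_v1(5, slot_executed=True))
    print(offset_from_v1(5, slot_executed=False))
```

With a1 = 5 this prints 160 and then 16. Only the first result depends on a1, so the block at 0xb0 can see a1 only if the daddiu at 0x58 has already written v0 by the time the dsll runs.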
So is it possible that the instructions at 0x54 and 0x58 are executed at the same time? As I understand it, a superscalar processor executes more than one instruction per clock cycle by simultaneously dispatching multiple instructions to redundant functional units on the processor.
My question is: how does gcc know that my CPU has this capability? What technique is gcc using here, and which optimization option makes it generate this kind of assembly code?
Thanks.