I need to profile different machine instructions for a project, so I'm running each instruction in an unrolled sequence of ~200 instructions at a time (using .rept in an __asm__ statement). The processor I'm using is an ARM Cortex-M4. I now need to test ARM's conditional instructions. If I enter something like
".rept 200\n\t"
"addeq r1, r1, r1\n\t"
".endr\n\t"
I get
Error: thumb conditional instruction should be in IT block -- `addeq r1,r1,r1'
Now, IT blocks can have up to 4 instructions, so the best I could do with them is something like
".rept 200\n\t"
"ITTTT EQ\n\t"
".rept 4\n\t"
"addeq r1, r1, r1\n\t"
".endr\n\t"
".endr\n\t"
yielding a binary like
80003ae: bf01 itttt eq
80003b0: 1849 addeq r1, r1, r1
80003b2: 1849 addeq r1, r1, r1
80003b4: 1849 addeq r1, r1, r1
80003b6: 1849 addeq r1, r1, r1
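For context, here is a minimal sketch of how a fragment like the one above could be wrapped in a complete C function. The function names, the macro names, the `__thumb2__` guard and the clobber list are assumptions made for illustration, not exactly what my project contains. The two small helpers just spell out the instruction-count arithmetic.

```c
/* Hypothetical sizes, mirroring the fragment above. */
#define IT_BLOCKS    200  /* outer .rept count: one ITTTT per iteration */
#define COND_PER_IT  4    /* inner .rept count: instructions covered by each ITTTT */

/* Two-level stringify so the .rept counts are taken from the macros above. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Hypothetical wrapper: emits the unrolled IT blocks when built for Thumb-2
 * (assumed to be signalled by __thumb2__), and compiles to an empty function
 * elsewhere so the file still builds on a host machine. */
static void run_addeq_blocks(void)
{
#if defined(__thumb2__)
    /* The EQ condition depends on whatever flags the preceding code left. */
    __asm__ volatile (
        ".rept " STR(IT_BLOCKS) "\n\t"
        "itttt eq\n\t"
        ".rept " STR(COND_PER_IT) "\n\t"
        "addeq r1, r1, r1\n\t"
        ".endr\n\t"
        ".endr\n\t"
        :
        :
        : "r1", "cc"   /* r1 may be overwritten; "cc" is listed conservatively */
    );
#endif
}

/* Total instructions emitted: one IT plus COND_PER_IT conditional ones per block. */
static unsigned total_instructions(void)
{
    return IT_BLOCKS * (1u + COND_PER_IT);
}

/* Number of emitted instructions that are IT rather than the profiled one. */
static unsigned it_instructions(void)
{
    return IT_BLOCKS;
}
```

With these assumed counts, 200 of the 1000 emitted instructions are IT instructions.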
This way, however, 1 in 5 instructions will not be the one I want to profile (causing some noise in the measurements I take). I have heard that IT blocks are required by the Thumb-2 ISA, and that the full ARM instruction set can use conditional instructions without them, so my question is: can I instruct the assembler to emit conditional instructions without IT blocks? Moreover, if I heard correctly and Thumb-2 does require them, is there a way to further reduce the "noise" (to better than 1 in 5 instructions)?
Thanks!
EDIT: I got a lot of useful comments (thanks!), but I realized I left out some important information needed to understand my goal, and I apologize for that. I'm trying to profile the power consumption of the CPU, so it does make a difference whether the IT block is "executed" or not, what the resulting binary encoding is, etc., while the number of clock cycles needed is not the focus here.
I think this means (but correct me if I'm wrong) that even if Thumb-2 cleverly hides the IT block complexity, I should still see a power difference when measuring with a multimeter.