I am trying to work out how many cycles some uint32 operations take on a 16-bit dsPIC. I started with bitwise AND and wrote the following program:

int main(void) {
    
    unsigned long var1, var2, var3;
    
    var1 = 80000ul;
    var2 = 190000ul;
    while (1) {
        var3 = var1 & var2;
    }
    var1 = 0;
    return 0;
}

Looking at the disassembly to see what the compiler generated, I got the following:

!        var3 = var1 & var2;
0x2DE: MOV [W14+4], W0
0x2E0: MOV [W14+6], W1
0x2E2: MOV.D [W14], W2
0x2E4: MOV W2, W4
0x2E6: MOV W3, W2
0x2E8: MOV W0, W3
0x2EA: MOV W1, W0
0x2EC: AND W4, W3, W4
0x2EE: AND W2, W0, W0
0x2F0: CLR W1
0x2F2: SL W0, #0, W1
0x2F4: MOV #0x0, W0
0x2F6: MOV.D W0, W2
0x2F8: MUL.UU W4, #1, W0
0x2FA: IOR W2, W0, W2
0x2FC: IOR W3, W1, W3
0x2FE: MOV W2, [W14+8]
0x300: MOV W3, [W14+10]

That is 20 cycles: 6 for I/O moves and 14 for core operations. This looks bonkers to me. Couldn't it just do this?

MOV.D [W14+4], W0
MOV.D [W14], W2
AND W0, W2, W0
AND W1, W3, W1
MOV.D W0, [W14+8]

That drops the core work to 2 cycles, which at least makes logical sense to me (two 16-bit-wide ANDs). What is the compiler up to that I don't understand?
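To make the "two 16-bit ANDs" reasoning concrete, here is a minimal C sketch that models a 32-bit AND as two independent 16-bit ANDs. The function name `and32_by_halves` is hypothetical and used only for illustration; the shift and OR at the end exist only to express the recombination in portable C and are not a claim about what the compiler emits.

```c
#include <stdint.h>

/* Hypothetical helper: models a 32-bit AND as two 16-bit ANDs,
   one per half, the way a 16-bit core could process it. */
uint32_t and32_by_halves(uint32_t a, uint32_t b)
{
    /* Low halves: one 16-bit AND. */
    uint16_t lo = (uint16_t)a & (uint16_t)b;

    /* High halves: a second 16-bit AND. */
    uint16_t hi = (uint16_t)(a >> 16) & (uint16_t)(b >> 16);

    /* Recombine the two halves into a 32-bit result. */
    return ((uint32_t)hi << 16) | lo;
}
```

With the values from the program above, 80000 is 0x13880 and 190000 is 0x2E630, so the result is 0x2000 (8192), the same as `80000ul & 190000ul`.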

ChateauDu
  • Did you enable optimization when you compiled? If not, don't expect efficient code. But since your function doesn't have any visible side effects (just manipulates local vars), expect it to compile to just an empty infinite loop. See [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) for how to write a function that takes args and returns a value so you can see optimized asm that actually does something. – Peter Cordes Jul 21 '20 at 04:07
  • That said, IDK why you'd get a multiply. What if your function isn't called `main`? Remember that `main` is special, and compilers may stick extra stuff into the start or end of `main`. – Peter Cordes Jul 21 '20 at 04:11
  • No the GCC optimization level was 0. I am new to this so I guess I didn't expect optimization to matter for something as simple as this. I will check out that link and see what effects optimization has! Thank you! – ChateauDu Jul 21 '20 at 04:30
  • `-O0` makes a huge difference for anything involving multiple C statements ([Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?](https://stackoverflow.com/q/53366394)), but there's sometimes plenty of difference even for a single statement. – Peter Cordes Jul 21 '20 at 05:11
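
The suggestion in the comments (a function that takes args and returns a value) could look like the sketch below. The name `and32` is hypothetical, and `uint32_t` from `<stdint.h>` is assumed in place of `unsigned long` so the width is explicit. Because the inputs arrive as arguments and the result is returned, an optimizing build has no grounds to discard the AND, unlike the locals-only loop in `main`.

```c
#include <stdint.h>

/* Hypothetical test function: the operands are parameters and the
   result is the return value, so the AND is an observable effect
   that the optimizer must keep. */
uint32_t and32(uint32_t a, uint32_t b)
{
    return a & b;
}
```

Compiling this with optimization enabled (for example `-O1`) and then reading the disassembly of `and32` alone should give a fairer picture of the per-operation cost than the `-O0` output of `main`.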
