I've been trying to write ARM assembly that is equivalent to this C code:

int test( unsigned int v1, unsigned int v2 )
{
    int res1, res2;

    res1 = bit_pos( v1 );
    res1 = bit_pos( v2 );

    return res1 + res2;
}

test should accept two arguments, in r0 and r1, call bit_pos for each argument, and then add the results. This is what I have so far:


        .arch armv4
        .syntax unified
        .arm
        .text
        .align 2
        .type bit_pos, %function
        .global bit_pos
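bit_pos:
        mov r1, r0              @ r1 = value to scan
        mov r0, #-1             @ result starts at -1, so bit_pos(0) returns -1
top:
        cmp r1, #0
        beq done                @ no bits left
        add r0, r0, #1          @ one more bit position
        lsr r1, #1              @ shift out the lowest bit
        b top

done:
        mov pc, lr              @ return, result in r0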

        .align 2
        .type test, %function
        .global test

test:
        push {r0, r1, r2, lr}
        mov r0, #0x80000000
        bl bit_pos
        mov r1, r0
        mov r0, #0x00000001
        bl bit_pos
        mov r2, r0
        pop {r0, r1, r2, lr}
        mov pc, lr

I've made multiple attempts at the test function, but it always fails the tests. The problem is in test; bit_pos itself works correctly.

I need to pass the following test cases:

checking 5 20 res=6
checking 1 0 res=-1
checking 175 100000 res=21

but currently I'm failing with:

checking 5 20 res=5
got 5, expected 6
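
For reference, the expected results listed above are consistent with bit_pos returning the index of the highest set bit, or -1 for an input of zero. Below is a minimal C sketch under that assumption. The names bit_pos_ref and test_ref are illustrative only and not part of the assignment, and test_ref assumes the second result is meant to be stored in res2. The third listed case only sums to 21 if the printed arguments are read as octal, which is also an assumption.

```c
/* Hypothetical reference model: index of the highest set bit,
   or -1 when v is zero. */
int bit_pos_ref(unsigned int v)
{
    int pos = -1;
    while (v != 0) {
        pos++;          /* one more bit position */
        v >>= 1;        /* shift out the lowest bit */
    }
    return pos;
}

/* Assumed intent of test(): add the bit positions of both arguments. */
int test_ref(unsigned int v1, unsigned int v2)
{
    int res1 = bit_pos_ref(v1);
    int res2 = bit_pos_ref(v2);
    return res1 + res2;
}
```

With this model, test_ref(5, 20) gives 6 and test_ref(1, 0) gives -1. test_ref(0175, 0100000), using octal literals, gives 21, while the decimal call test_ref(175, 100000) gives 23.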

I'm really bad at assembly, but I promise I've given this my best shot. It's been 5 hours and I'm tired.

My current attempt:

test:
        push {r4, lr}
        mov r4, r0
        bl bit_pos
        bl bit_pos
        add r0, r4
        pop {r4, lr}
        mov pc, lr

checking 5 20 res=6
checking 1 0 res=0
got 0, expected -1

Test failed
Jeff924D
  • `r1` is call-clobbered in the standard calling convention, including the way you're using it in `bit_pos`. But your `test()` caller seems to assume that its `r1` value will survive across the 2nd call. And `test` strangely saves/restores its caller's r0..2, so it's not returning any values. Save/restore `{r4, lr}` around the function, and keep the first return value in `r4`. After the 2nd call, `add r0, r4` then return, with the `int` return value in `r0`, exactly like a compiler would. – Peter Cordes Nov 20 '22 at 05:10
  • Oh wait, looking at compiler output highlights a bug in your C source: both return values are assigned to `res1`, neither to `res2`. So the first return value isn't needed, and the return value depends on reading an uninitialized value. https://godbolt.org/z/3cf4dqaWa . (It's a bit more complicated when `test` uses its incoming args instead of just `mov`-immediate constants; it has to save `v2` across the first call.) – Peter Cordes Nov 20 '22 at 05:14
  • You could of course just use a custom calling convention where `bit_pos` actually takes another arg, a counter value to increment, e.g. in `r0`, so instead of having to count up from zero for the caller to add, it could just increment that count as it slowly loops one bit at a time. (There are much faster ways to find the position of the highest bit, ARM has a `clz` instruction to count leading zeros, so the position is `31-clz(x)`. But if you want simple and easy to understand, yeah a loop that shifts out bits one at a time works.) – Peter Cordes Nov 20 '22 at 05:20
  • I tried your first comment Peter and it seems to pass the first test case now, but it fails on the second, can you look at the updated post? – Jeff924D Nov 20 '22 at 05:21
  • In your updated attempt, you do two `bl` calls back to back, with no instructions to save the first return value anywhere or to pass a different arg to the 2nd call. If you single-step with a debugger, single-step to that point and think about what's in registers when the first call returns, and what needs to change before the second. – Peter Cordes Nov 20 '22 at 05:22
  • I'll try again. Thanks for your help, I appreciate it a lot. – Jeff924D Nov 20 '22 at 05:24
  • I figured it out! Thank you! – Jeff924D Nov 20 '22 at 05:33
  • If you can post your comments as an answer I will approve it. Thank you so much! That was a really helpful explanation. – Jeff924D Nov 20 '22 at 05:34
  • Pretty sure this kind of question has been asked before. Answers like [What are callee and caller saved registers?](https://stackoverflow.com/a/56178078) explain the general concept of how to use registers for temporary values vs. ones that need to survive across a call. And similarly [Why should certain registers be saved? What could go wrong if not?](https://stackoverflow.com/q/69419435). Also [What is the usecase of Scratch Registers in ARM?](https://stackoverflow.com/q/62840714) is good. – Peter Cordes Nov 20 '22 at 05:48
  • I'll check them out, they look really useful for my case. Thanks again. – Jeff924D Nov 20 '22 at 05:55
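
The register behaviour discussed in the comments can be traced with a small C model in which variables stand in for ARM registers. This is only a sketch. It assumes bit_pos starts its counter at -1 (which is what the reported outputs imply), and the function names and the use of r5 in the corrected version are hypothetical, not taken from the original post.

```c
#include <stdint.h>

/* Stand-ins for the ARM registers touched by the code in the question. */
static uint32_t r0, r1, r4, r5;

/* Models "bl bit_pos": argument in r0, result in r0, r1 is clobbered. */
static void bl_bit_pos(void)
{
    r1 = r0;                  /* mov r1, r0 */
    r0 = (uint32_t)-1;        /* counter start, assumed to be -1 */
    while (r1 != 0) {         /* cmp r1, #0 ; beq done */
        r0 += 1;              /* add r0, r0, #1 */
        r1 >>= 1;             /* lsr r1, #1 */
    }
}

/* First attempt: constants are passed instead of the arguments, and the
   final pop restores the caller's r0, so v1 itself is returned. */
int first_attempt_model(uint32_t v1, uint32_t v2)
{
    r0 = v1; r1 = v2;
    uint32_t saved_r0 = r0;   /* push {r0, r1, r2, lr} */
    r0 = 0x80000000u;         /* mov r0, #0x80000000 */
    bl_bit_pos();             /* result is parked in r1, then lost */
    r0 = 1;                   /* mov r0, #0x00000001 */
    bl_bit_pos();             /* result is parked in r2, then lost */
    r0 = saved_r0;            /* pop {r0, r1, r2, lr} */
    return (int)r0;
}

/* Second attempt: two back-to-back calls, then add r0, r4. */
int attempt_model(uint32_t v1, uint32_t v2)
{
    r0 = v1; r1 = v2;
    r4 = r0;                  /* mov r4, r0 : saves v1, not a result */
    bl_bit_pos();             /* r0 = bit_pos(v1); v2 in r1 is destroyed */
    bl_bit_pos();             /* r0 = bit_pos(bit_pos(v1)) */
    r0 += r4;                 /* add r0, r4 : adds v1 itself */
    return (int)r0;
}

/* One possible fix: keep v2 and the first result in call-preserved
   registers (these would need push {r4, r5, lr} / pop {r4, r5, lr}). */
int fixed_model(uint32_t v1, uint32_t v2)
{
    r0 = v1; r1 = v2;
    r4 = r1;                  /* mov r4, r1 : keep v2 across the call */
    bl_bit_pos();             /* r0 = bit_pos(v1) */
    r5 = r0;                  /* mov r5, r0 : keep the first result */
    r0 = r4;                  /* mov r0, r4 : pass v2 to the second call */
    bl_bit_pos();             /* r0 = bit_pos(v2) */
    r0 += r5;                 /* add r0, r0, r5 */
    return (int)r0;           /* reinterpret the register as signed */
}
```

Under these assumptions the model reproduces the reported numbers: first_attempt_model(5, 20) gives 5, attempt_model(5, 20) gives 6 by coincidence, and attempt_model(1, 0) gives 0, while fixed_model returns 6 and -1 for the same inputs.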

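
The 31-clz(x) idea mentioned in the comments can be sketched in C as follows. This assumes a GCC- or Clang-compatible compiler, because __builtin_clz is a compiler builtin rather than standard C, and it assumes unsigned int is 32 bits wide. The builtin's result is undefined for 0, so zero is handled separately. The name bit_pos_clz is hypothetical.

```c
/* Highest set bit via count-leading-zeros; -1 for zero.
   Assumes a 32-bit unsigned int and a GCC/Clang-style builtin. */
int bit_pos_clz(unsigned int v)
{
    if (v == 0)
        return -1;                  /* __builtin_clz(0) is undefined */
    return 31 - __builtin_clz(v);   /* position = 31 - leading zeros */
}
```

For example, bit_pos_clz(5) is 2 and bit_pos_clz(20) is 4. Note that the clz instruction itself was introduced with ARMv5, so it is not available when assembling for the .arch armv4 target used in the question.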