I am currently studying for the midterm in my Computer Organization class, and one section of it requires you to read and understand code in order to find the value of a certain register. Are there any good resources where I can practice that specific skill? I couldn't find anything of the sort by googling, so I figured I would ask here.

Also, I'm sorry if this is a duplicate question. I looked around the site and haven't found anything like it, but if there is one, a link to it would be much appreciated.

Thank you in advance!

flip1012
  • Next time you see a C or C++ function and think "I wonder how that compiles for ARM", put it up on http://gcc.godbolt.org/ and select ARM gcc as the compiler. Use `-O3` to get optimized code. Use `-fverbose-asm` to help you follow the code, if you want. As a bonus, Godbolt makes it super-easy to compare with x86, MIPS, PowerPC. (And with clang or ICC on x86). With the recent UI improvement, you can even have two different compiler-output sub-windows open at once! – Peter Cordes Oct 23 '16 at 22:18
  • @PeterCordes OP could also install an ARM toolchain on his computer. This has the advantage of not requiring internet access after the installation. – fuz Oct 23 '16 at 22:28
  • @FUZxxl: yeah, but then you need to de-noise the gcc output yourself: http://stackoverflow.com/questions/38552116/how-to-remove-noise-from-gcc-clang-assembly-output. Still, good suggestion if you want to do this on a laptop somewhere. Also good if you want to try modifying the asm to see if it still works after your attempt at hand-optimizing it. (e.g. using user-mode QEMU on a non-ARM Linux system to single-step through an ARM binary with gdb, [like I describe here](http://stackoverflow.com/questions/39503997/how-to-run-a-single-line-of-assembly-then-see-r1-and-condition-flags)). – Peter Cordes Oct 23 '16 at 22:32
  • Have you tried reviewing what is between your wrist and shoulder? – Michael Petch Oct 23 '16 at 22:47
  • Thanks for the helpful advice, I might just do that. – flip1012 Oct 23 '16 at 22:52

2 Answers

I am in the "get the tools and just try it" camp.

unsigned int fun ( unsigned int a, unsigned int b )
{
    return(a+b+1);
}

As mentioned, the problem is the optimizer. If you don't optimize, you get something like this:

arm-none-eabi-gcc -c fun.c -o fun.o
arm-none-eabi-objdump -D fun.o

00000000 <fun>:
   0:   e52db004    push    {r11}       ; (str r11, [sp, #-4]!)
   4:   e28db000    add r11, sp, #0
   8:   e24dd00c    sub sp, sp, #12
   c:   e50b0008    str r0, [r11, #-8]
  10:   e50b100c    str r1, [r11, #-12]
  14:   e51b2008    ldr r2, [r11, #-8]
  18:   e51b300c    ldr r3, [r11, #-12]
  1c:   e0823003    add r3, r2, r3
  20:   e2833001    add r3, r3, #1
  24:   e1a00003    mov r0, r3
  28:   e24bd000    sub sp, r11, #0
  2c:   e49db004    pop {r11}       ; (ldr r11, [sp], #4)
  30:   e12fff1e    bx  lr

It is actually quite readable, but it takes more work to follow than the optimized version:

arm-none-eabi-gcc -O2 -c fun.c -o fun.o
arm-none-eabi-objdump -D fun.o

00000000 <fun>:
   0:   e2811001    add r1, r1, #1
   4:   e0810000    add r0, r1, r0
   8:   e12fff1e    bx  lr
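
Since the exam question is about finding the value of a register, one way to practice on a listing like this is to trace it by hand, one instruction at a time. Below is a minimal sketch of that trace written as C. The name `trace_fun` and the variables `r0` and `r1` are illustrative stand-ins for the registers, not part of the original code, and the sketch assumes the first argument arrives in r0, the second in r1, and the result leaves in r0.

```c
#include <stdint.h>

/* Hand trace of the three-instruction -O2 listing above.
   r0 and r1 are plain variables standing in for the ARM registers
   (assumption: a arrives in r0, b in r1, result is returned in r0). */
uint32_t trace_fun(uint32_t a, uint32_t b)
{
    uint32_t r0 = a;   /* first argument on entry  */
    uint32_t r1 = b;   /* second argument on entry */

    r1 = r1 + 1;       /* add r1, r1, #1 */
    r0 = r1 + r0;      /* add r0, r1, r0 */

    return r0;         /* bx lr: the caller reads the result from r0 */
}
```

Writing down the register contents after each line, as the comments do here, is the same exercise as the exam question, just with a compiler available to check your answer.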

You will start to get a feel for how to write simple code that doesn't get optimized out, which is another good lesson IMO. Unoptimized output, at least with GCC, has a strong desire to set up a stack frame and then save the passed-in operands to the stack. Any local or intermediate variables it thinks it needs also get stack space. Every line of C code is handled separately, in order, from the stack: the operands are loaded from the stack and the results are stored back, even if they are reloaded right away into the same register. Hence the term "optimized": the compiler can easily remove a lot of that code by keeping things in registers.
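
As a rough illustration of "code that doesn't get optimized out", here is a sketch with two hypothetical functions (`folded` and `kept` are names made up for this example). The first has nothing that depends on the caller, so an optimizing compiler is free to replace the arithmetic with a constant; the second depends on its arguments, so the additions typically survive into the disassembly.

```c
/* Everything here is known at compile time, so an optimizing
   compiler is free to fold the arithmetic into a single constant. */
unsigned int folded(void)
{
    unsigned int a = 2;
    unsigned int b = 3;
    return a + b + 1;
}

/* The result depends on the caller's arguments, so the additions
   have to show up somewhere in the generated code. */
unsigned int kept(unsigned int a, unsigned int b)
{
    return a + b + 1;
}
```

Compiling both with and without `-O2` and comparing the two disassemblies side by side should make the difference visible; the exact instructions will depend on your compiler and version.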

You can also compile without the stack frame. That is an exercise for the reader, and a very simple Google search.

You can also begin to see the calling convention in action if you can overcome the optimizer (or can tolerate all the stack stuff if not optimized).

    unsigned int more_fun ( unsigned int, unsigned int );
    unsigned int fun ( unsigned int a, unsigned int b )
    {
        return(more_fun(a+1,b+2)+3);
    }

   0:   e92d4010    push    {r4, lr}
   4:   e2811002    add r1, r1, #2
   8:   e2800001    add r0, r0, #1
   c:   ebfffffe    bl  0 <more_fun>
  10:   e8bd4010    pop {r4, lr}
  14:   e2800003    add r0, r0, #3
  18:   e12fff1e    bx  lr

The immediate 1 is added to r0, so that must be our first parameter. 2 goes with r1, the second parameter. 3 is added to r0 coming back from the called function, so r0 must be where the return value goes for simple functions like this. (64-bit return values, structures, and so on are an exercise for the reader. You can also read the recommended calling convention from ARM, but 1) compilers can do whatever they want, and 2) sometimes it is much easier just to compile a function and disassemble it.)
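
For the 64-bit and structure exercise, a starting point might look like the sketch below. The names `wide`, `struct pair`, and `make_pair` are hypothetical, chosen only for this example. The idea is the same as before: compile it, disassemble it, and work out from the instructions where each return value ends up.

```c
#include <stdint.h>

/* A small structure to return by value. */
struct pair {
    uint32_t lo;
    uint32_t hi;
};

/* Returns a value wider than one 32-bit register:
   a becomes the upper half, b the lower half. */
uint64_t wide(uint32_t a, uint32_t b)
{
    return ((uint64_t)a << 32) | b;
}

/* Returns a structure by value. */
struct pair make_pair(uint32_t a, uint32_t b)
{
    struct pair p;
    p.lo = a;
    p.hi = b;
    return p;
}
```

Building this with the same `arm-none-eabi-gcc -O2 -c` and `arm-none-eabi-objdump -D` commands used above should show how your particular compiler handles each case.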

The other mystery here is why r4 is pushed. Or maybe your compiler pushes r3 or some other register along with lr, or maybe it doesn't. This is another SO question that gets asked many times over without anyone finding the existing answer (because it is hard to search for). ARM recommends keeping the stack 64-bit aligned; r4 in this case is just an arbitrary register. It could have been almost any of them; the compiler just needed to push two and pop two. Why, then, didn't the prior example push another register along with r11? Apparently GNU didn't see the need to worry about stack alignment during interrupts, and the stack adjustment while building the stack frame compensates to make it 64-bit aligned (the 4-byte push plus the 12-byte `sub sp` comes to 16 bytes), so you just have that interrupt exposure for a few instructions. I don't know what ARM recommends with respect to that.

You can take all the existing code you want, from any project of any size, and try to read the disassembly. Depending on how it was built, you may end up with a lot of stack stuff like the unoptimized code above. You may also end up with lots of nice optimizations for loading immediates into registers: values that can't be loaded in one instruction but that the compiler doesn't want to burn a PC-relative load on. And maybe this is exactly what you are after: what real-world code looks like, and whether you can read it. You will want both ends: the unoptimized code, which is a LOT more code but a somewhat linear translation from C to machine code, and the optimized code, with re-ordering of operations, dead code elimination, tricks related to immediates and other math, as well as tail optimizations and other things. Then if you venture into mixed ARM/Thumb, more fun happens. Add floating point and more fun happens.
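
To get a feel for which immediates fit in one instruction, it helps to know the rule for classic 32-bit ARM (not Thumb) data-processing instructions: the immediate is an 8-bit value rotated right by an even number of bits. Here is a minimal sketch of that check in C. The names `rotl32` and `arm_imm_encodable` are made up for this example, and the check only covers the plain rotated-immediate form; a compiler has other options (an inverted constant via `mvn`, for example) that it does not model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n taken modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/* True if value can be written as an 8-bit constant rotated right
   by an even amount (0, 2, ..., 30), which is the classic ARM
   data-processing immediate form. */
bool arm_imm_encodable(uint32_t value)
{
    for (unsigned rot = 0; rot < 32u; rot += 2u) {
        /* Rotating left undoes a rotate-right by the same amount;
           if what remains fits in 8 bits, the value is encodable. */
        if (rotl32(value, rot) <= 0xFFu)
            return true;
    }
    return false;
}
```

For example, `0xFF000000` passes the check while `0x101` and `0x12345678` do not, so for the latter two you would expect the compiler to use more than one instruction or some kind of load.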

There is no reason to expect any two differently branded compilers (GNU, LLVM, etc.) to produce the same output from the same input. Nor is there any reason to expect two versions of the same brand of compiler (for lack of a better term) to produce the same results from the same source with the same command-line options. So that again multiplies the fun.

Bottom line: the tools have been there all along; it is just a matter of using them.

old_timer

Maybe you can practice by writing some exercises in ARM assembly yourself, to get to know how things work. For instance, compute the sum of an array of integers, then the average, then find the largest element, and so on.
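
If it helps to have something to check your hand-written assembly against, the three exercises could be sketched in C as below and then compiled and disassembled for comparison. The function names are hypothetical, and returning 0 for an empty array is just a choice made for this sketch.

```c
#include <stddef.h>

/* Sum of n elements; unsigned arithmetic wraps on overflow. */
unsigned int array_sum(const unsigned int *v, size_t n)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* Integer average, truncated toward zero; 0 for an empty array. */
unsigned int array_average(const unsigned int *v, size_t n)
{
    return n ? array_sum(v, n) / (unsigned int)n : 0;
}

/* Largest element; 0 for an empty array. */
unsigned int array_largest(const unsigned int *v, size_t n)
{
    unsigned int best = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > best)
            best = v[i];
    }
    return best;
}
```

Depending on the target, the division in `array_average` may turn into a library call rather than a single instruction, which is itself worth seeing in a disassembly.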

Afterward, you'll need to get familiar with the calling convention. For that, I advise you to read some code from an operating system, such as a DIY bare-metal OS for the Raspberry Pi.

Finally, you can practice on some ARM reverse-engineering challenges (crackmes), and you'll be more than prepared!

Aif
  • Practice by itself is great, though I've already written several programs in ARM, ranging from simple parsing of a string to count the vowels all the way to using jump tables to encrypt a message. I guess I can go back over them and really familiarize myself with them, to make sure I know what's going on in each one. Thanks for the advice though, it was really helpful! – flip1012 Oct 23 '16 at 22:56