I want to use inline assembly to read two memory locations into C variables and then store two C variables to other memory locations. The inline code that I have written looks like this:

unsigned int us32 = (uint32_t)us;
__asm__ __volatile__("ldr %1, [%4, #0x10]\r\n"    //read my 2 memory locations
                     "ldr %0, [%4, #0x18]\r\n" 
                     "str %2, [%4, #0x14]\r\n"    //write my 3 memory locations
                     "str %2, [%4, #0x18]\r\n"    
                     "str %3, [%4, #0x10]\r\n"
                     : "=l" (leftover_time), "=l" (rollflag)
                     : "l" (us32), "l" (ena), "l" (s)
                     : "memory", "cc");

The assembly generated from my inline code, however, doesn't work. It loads the values that I want to store into r2 and r3 and then promptly overwrites those registers with the values I am trying to load. This is clear from the disassembly below, which I got using arm-none-eabi-objdump:

 us32 = (uint32_t)us;
 c8e:       6bbb            ldr     r3, [r7, #56]   ; 0x38
 c90:       637b            str     r3, [r7, #52]   ; 0x34
__asm__ __volatile__("ldr %1, [%4, #0x10]\r\n"
 ;;; These 4 instructions load the variables I want to write to memory
 ;;; into r2 and r3
 c92:       2207            movs    r2, #7
 c94:       4b39            ldr     r3, [pc, #228]  ; (d7c <reschedule+0x16c>)
 c96:       6819            ldr     r1, [r3, #0]
 c98:       6b7b            ldr     r3, [r7, #52]   ; 0x34

 ;;; BOOM!!  r2 and r3 have been clobbered, they no longer contain the 
 ;;; values that I want to write (a constant #7 and unsigned int us32).
 ;;; 
 ;;; The data that I want to read is indeed pointed by r1 + 16 and r1 + 24
 c9a:       690b            ldr     r3, [r1, #16]
 c9c:       698a            ldr     r2, [r1, #24]
 c9e:       614b            str     r3, [r1, #20]
 ca0:       618b            str     r3, [r1, #24]
 ca2:       610a            str     r2, [r1, #16]
 ca4:       633a            str     r2, [r7, #48]   ; 0x30
 ca6:       62fb            str     r3, [r7, #44]   ; 0x2c

I've read different inline assembly tutorials for hours and I've checked & double-checked my input/output constraints and I'm just sitting here scratching my head. Can someone catch my mistake?

I am using arm-none-eabi-gcc version 4.8.4

John M
  • Your code doesn't clobber flags, so you don't need a `"cc"` clobber. You could also avoid the `"memory"` clobber if you used memory output operands. (Either a dummy struct covering all the stores like the clobber documentation suggests, or individual memory operands for each store to let gcc choose the addressing mode. `[named] "r" (operands)` are highly recommended when you have many operands, to avoid confusion when adding more renumbers them.) – Peter Cordes May 19 '16 at 19:13
  • Can you use `strd` with the same register twice, to do the first two stores with one instruction? If no, you could reorder to do the #10 and #14 stores in one strd, but that will be worse on in-order CPUs. – Peter Cordes May 19 '16 at 19:17
  • @PeterCordes The "cc" flags are left over from some other code that I wrote; I'll remove them. Thanks for the tips about named registers & the struct. Also, I can't do `strd` because I am using m0+, which does not support that instruction. – John M May 19 '16 at 19:33

2 Answers

The relevant paragraph is hidden right in the middle of the extended asm documentation:

Use the & constraint modifier (see Modifiers) on all output operands that must not overlap an input. Otherwise, GCC may allocate the output operand in the same register as an unrelated input operand, on the assumption that the assembler code consumes its inputs before producing outputs. This assumption may be false if the assembler code actually consists of more than one instruction.

If you had the stores before the loads, you would actually meet that assumption and everything would be fine. Since the loads come first, the outputs are written while the inputs are still needed, so you need to mark the outputs as earlyclobber operands, i.e. "=&l".
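As a minimal sketch of that fix, the statement from the question could be wrapped like this, with "=&l" on both outputs. The function name `timer_swap`, the named operands, and the `uint32_t` types are assumptions for illustration; only the offsets and the load/store order come from the question. The "cc" clobber is dropped, since none of these instructions set the flags. The plain C branch is a hypothetical fallback so the snippet also builds on non-ARM hosts; it performs the same accesses in the same order.

```c
#include <stdint.h>

/* Hypothetical wrapper: `base` plays the role of `s` in the question. */
static inline void timer_swap(volatile uint32_t *base,
                              uint32_t us32, uint32_t ena,
                              uint32_t *leftover_time, uint32_t *rollflag)
{
#if defined(__arm__)
    uint32_t lt, rf;
    __asm__ __volatile__(
        "ldr %[rf], [%[s], #0x10]\n\t"   /* read the two locations */
        "ldr %[lt], [%[s], #0x18]\n\t"
        "str %[us], [%[s], #0x14]\n\t"   /* then write the three locations */
        "str %[us], [%[s], #0x18]\n\t"
        "str %[en], [%[s], #0x10]\n\t"
        /* '&' marks the outputs as earlyclobber: they are written before
           the inputs are consumed, so they must not share a register
           with any input. */
        : [lt] "=&l" (lt), [rf] "=&l" (rf)
        : [us] "l" (us32), [en] "l" (ena), [s] "l" (base)
        : "memory");
    *leftover_time = lt;
    *rollflag = rf;
#else
    /* Assumed portable fallback: same accesses, same order, in plain C. */
    *rollflag      = base[0x10 / 4];
    *leftover_time = base[0x18 / 4];
    base[0x14 / 4] = us32;
    base[0x18 / 4] = us32;
    base[0x10 / 4] = ena;
#endif
}
```

With the earlyclobber modifier in place, the compiler has to pick registers for `lt` and `rf` that differ from those holding `us32`, `ena`, and `base`, which is what the disassembly in the question was missing.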

Notlikethat
  • Wow... that totally fixed my issue. I can't believe I wasn't able to find that after all the reading I did. It's pretty surprising to me that gcc does not assume the & automatically. – John M May 19 '16 at 18:16
  • @johnny_boy: GNU C inline asm syntax is designed for wrapping a single instruction that you can't get the compiler to emit, which is why the default is that way. Usually it's best to let the compiler do as much as possible. See [the bottom of this answer](http://stackoverflow.com/questions/34520013/using-base-pointer-register-in-c-inline-asm/34522750#34522750) for a collection of links to GNU C inline asm guides and Q&As. They all use x86, but the concepts apply equally to ARM. Writing GNU C inline asm is the hardest way to learn asm, because it's so easy to get constraints wrong. – Peter Cordes May 19 '16 at 19:09
  • @johnny_boy In most cases you'd _want_ any input registers to be reused for output registers where possible, if the alternative means spilling to the stack around the asm. – Notlikethat May 19 '16 at 20:03

You clobbered r2 and r3 before consuming the inputs and you forgot to inform the compiler. You need an & early-clobber modifier for them:

Use the ‘&’ constraint modifier (see Modifiers) on all output operands that must not overlap an input. Otherwise, GCC may allocate the output operand in the same register as an unrelated input operand, on the assumption that the assembler code consumes its inputs before producing outputs. This assumption may be false if the assembler code actually consists of more than one instruction.

Source

Jester