RISC-V inline assembly

Question

I'm quite new to inline assembly, so I need your help to be sure that I use it correctly. I need to add assembly code inside my C code that is compiled with the RISC-V toolchain. Please consider the following code:

int bar = 0xFF00;

int main(){
    volatile int result;
    int k;
    k = funct();
    int* ptr;
    ptr = &bar;
    asm volatile (".insn r 0x33, 0, 0, a4, a5, a3":
                       "=m"(*ptr), "=r"(result):
                       [a5] "m"(*ptr), [a3] "r"(k) :
                      );
        
    }
...

What I want to do is bar = bar+k. Actually, I want to change the content of the memory location that bar resides in. But the code that I wrote gets the address of bar and adds it to k. Does anybody know what the problem is?

What is that single instruction with opcode 0x33 supposed to do? RISC-V is a load/store machine; a single instruction can't load + add + store, so you'll need to take `"r"` and `"=r"` register input / output operands with the compiler emitting loads and stores. Unless you added a custom memory-destination instruction to the ISA? Also, you hard-code some register names but didn't tell the compiler to pick those registers for the `"r"` constraints, so that can't work. — Peter Cordes, Dec 07 '22 at 20:13
Opcode 0x33 does an addition operation. No, I didn't add a custom memory-destination operation to the ISA. I tried without hard-coding the register names and also taking both `result` and `k` as input/output operands, but it still doesn't work. — Sooora, Dec 07 '22 at 20:41
What compiler are you actually using? GCC? clang? IAR? Apparently IAR has its own meaning for `"a3"(foo)`, different from standard GNU C. — Peter Cordes, Dec 18 '22 at 21:04

score 2 · Answer 1 · answered Dec 18 '22 at 11:36

Unfortunately, you have misunderstood the syntax.

In the assembler string, you can refer to an operand in one of two ways. You can use %0, %1, and so on, where the number is the position of the operand in the asm statement (counting from zero). Alternatively, you can use a symbolic name, %[myname], which refers to the operand declared in the form [myname]"r"(k).

Note that a symbolic name is just another way of writing the number; the name itself doesn't imply anything. Your example gives the impression that naming the operands [a5] and [a3] forces the code to use those specific processor registers, but it doesn't. (There is another syntax for that, if you really need to use it.)
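
As a rough illustration of the two spellings, here is a minimal sketch that assumes GNU C extended asm (GCC or clang) rather than IAR. The function names `add_positional`, `add_named` and `self_check` are made up for this example. `long` is used because it is register-width under the standard RV32 and RV64 ABIs, and the `#else` branches are only a plain-C fallback so the snippet still builds on a non-RISC-V host.

```c
#include <assert.h>

/* Positional form: operands are numbered in the order they are listed,
   outputs first, so %0 is sum, %1 is a and %2 is b. */
long add_positional(long a, long b) {
    long sum;
#if defined(__riscv)
    __asm__ (".insn r 0x33, 0, 0, %0, %1, %2"
             : "=r"(sum)
             : "r"(a), "r"(b));
#else
    sum = a + b;   /* assumption: plain-C fallback for non-RISC-V hosts */
#endif
    return sum;
}

/* Named form: [dst], [lhs] and [rhs] are only labels for the same three
   operands; they do not select registers called dst, lhs or rhs. */
long add_named(long a, long b) {
    long sum;
#if defined(__riscv)
    __asm__ (".insn r 0x33, 0, 0, %[dst], %[lhs], %[rhs]"
             : [dst] "=r"(sum)
             : [lhs] "r"(a), [rhs] "r"(b));
#else
    sum = a + b;   /* assumption: plain-C fallback for non-RISC-V hosts */
#endif
    return sum;
}

/* Both spellings should behave like ordinary addition. */
void self_check(void) {
    assert(add_positional(2, 3) == 5);
    assert(add_named(2, 3) == 5);
}
```

In both functions the compiler chooses the registers; the names only make the template easier to read.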

For example, if you write something like:

int bar = 0xFF00;

int main(){
    volatile int result;
    int k;
    k = funct();
    int* ptr;
    ptr = &bar;
    asm volatile (".insn r 0x33, 0, 0, %[res], %[res], %[ptr]":
                  [res]"+r"(result) : [ptr]"r"(ptr));
}

The IAR compiler will emit the following. As you can see, a0 has been assigned to the result variable (via the symbolic name res) and a1 to the variable ptr (here, the symbolic name is the same as the variable name).

   \   000014 0001'2503    lw        a0, 0x0(sp)
   \   000018 0000'05B7    lui       a1, %hi(bar)
   \   00001C 0005'8593    addi      a1, a1, %lo(bar)
   \   000020 00B5'0533    .insn r 0x33, 0, 0, a0, a0, a1
   \   000024 00A1'2023    sw        a0, 0x0(sp)

You can read more about the IAR inline assembly syntax in the book "IAR C/C++ Development Guide Compiling and linking for RISC-V", in chapter "Assembler Language Interface". The book is provided as a PDF, which you can access from within IAR Embedded Workbench.

score 1 · Answer 2 · sharpgeek · edited Dec 20 '22 at 10:15

Based on the snippet provided in your question, I tried the following code with the IAR C/C++ Compiler for RISC-V:

int funct();
int funct() { return 0xA5; } // stub

int bar = 0xFF00;

int main() {
    int k = funct();
    int* ptr = &bar;
    asm volatile (".insn r 0x33, 0, 0, %[res], %[ptr], %[k]"
                    : [res]"=r"(*ptr)
                    : [ptr]"r"(*ptr), [k]"r"(k));
}

In this case, the .insn directive will generate add r,r,r which is effectively *ptr = *ptr + k.

An earlier version of this answer assumed that the registers had to be chosen explicitly, and used explicit register constraints, which the IAR compiler allows (e.g., "a3", "=a3", "a4", "a5"). As @PeterCordes noted in the comments, GCC has a different set of constraints and would need a different solution. However, if there is no need to pin specific registers, it is better to let the compiler choose them; that generally imposes less overhead.
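
For GCC or clang, the usual way to pin operands to particular registers is the register-asm local variable mentioned in the comments. The following is a minimal sketch under that assumption (GNU C, not IAR). The helper names `add_to_bar` and `self_check` and the use of `long` instead of `int` are choices made for this illustration, and the `#else` branch is a plain-C fallback so the snippet builds on a non-RISC-V host.

```c
#include <assert.h>

long bar = 0xFF00;

/* bar = bar + k, with the operands pinned to the registers that the
   question hard-coded: rd = a4, rs1 = a5, rs2 = a3. */
void add_to_bar(long k) {
#if defined(__riscv)
    register long dst __asm__("a4");         /* destination lives in a4  */
    register long lhs __asm__("a5") = bar;   /* compiler loads bar to a5 */
    register long rhs __asm__("a3") = k;     /* compiler moves k to a3   */

    /* Because the variables are pinned, the plain "r" constraints can
       only be satisfied with a4, a5 and a3. */
    __asm__ (".insn r 0x33, 0, 0, %[dst], %[lhs], %[rhs]"
             : [dst] "=r"(dst)
             : [lhs] "r"(lhs), [rhs] "r"(rhs));

    bar = dst;                               /* compiler emits the store */
#else
    bar = bar + k;   /* assumption: plain-C fallback for non-RISC-V hosts */
#endif
}

void self_check(void) {
    bar = 0xFF00;
    add_to_bar(0xA5);
    assert(bar == 0xFFA5);
}
```

The load of `bar` and the store back to it stay in C, so the compiler emits them and no `"memory"` clobber is needed.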

edited Dec 20 '22 at 10:15

answered Dec 17 '22 at 10:46

sharpgeek

`"a3"` is the compiler's choice of `"a"` or `"3"`. And `"=a3"` doesn't compile; GCC says `matching constraint not valid in output operand`. Numbers in constraints like `"3"` for input operands mean to pick the same register as operand 3. Multiple characters in the same constraint are alternatives, not register names. Like `"rm"` gives it a choice of register or memory operands. (Makes sense on a CISC where some instructions allow either.) – Peter Cordes Dec 17 '22 at 19:58
To force `"r"` or `"=r"` to pick a specific register, use register-asm local variables; that's what they're for. https://stackoverflow.com/tags/inline-assembly/info / [ARM inline asm: exit system call with value read from memory](https://stackoverflow.com/a/37363860) – Peter Cordes Dec 17 '22 at 19:58
Thanks, @PeterCordes. That was great feedback. The biggest problem perhaps was the "RISC-V toolchain" ambiguity. I was using the IAR iccriscv v3.10 for testing on bare metal; however, it seems to have a [different set of constraint options](https://netstorage.iar.com/FileStore/STANDARD/001/001/343/riscv/doc/EWRISCV_DevelopmentGuide.ENU.pdf#page=155) from the ones provided by GCC, allowing one to be explicit about the registers to be used. I will update my answer to clarify which toolchain I used earlier for testing. – sharpgeek Dec 18 '22 at 10:51
The OP hasn't said what compiler they're using, but both this and the other answer are about it, not GCC or clang. We should get that sorted out and tag as appropriate for the compiler. – Peter Cordes Dec 18 '22 at 20:58
You'd only need a `"memory"` clobber [if you were loading or storing inside the asm statement itself](https://stackoverflow.com/questions/56432259/how-can-i-indicate-that-the-memory-pointed-to-by-an-inline-asm-argument-may-be), to an operand that wasn't `"m"` or `"=m"` respectively. In this case, the asm template contains purely a register instruction, and the `(*ptr)` as an output operand tells the compiler to assign the output register to that C object. The compiler is even responsible for emitting the store instruction, so it definitely knows about it. – Peter Cordes Dec 18 '22 at 21:02
Good edit, yes, letting the compiler pick registers is part of the point of `.insn`, instead of hard-coding a whole 32-bit word. Re: portability. GNU C documents `asm` and `__asm__`, the latter working even with `-std=c99` stricter ISO C which keeps the global namespace clean. https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html. It happens to also support `__asm`, perhaps for compat with some compilers like IAR. – Peter Cordes Dec 20 '22 at 09:20
But wait a minute, the destination is write-only, so the output operand should be `"=r"` not `"+r"`, no need to get the compiler to load an extra copy of the register into it. Also, your template string doesn't use `%[res]`, so it might write a different register than the one the compiler picked! By avoiding an early-clobber declaration, you're letting the compiler pick the same register for the output as for either of the inputs, or a different one if it wants to preserve the original value. – Peter Cordes Dec 20 '22 at 09:24
You are right again! It was using a0 (res) and a2 (ptr) whereas it should be using the same register. Thanks for pointing it out! – sharpgeek Dec 20 '22 at 09:35
You should use 3 different names for the 3 operands. There's no benefit to trying to force the compiler to pick the same register for the destination and first source; it will if that's more efficient, with an `"=r"` output and two `"r"` inputs. – Peter Cordes Dec 20 '22 at 09:41
The template `".insn r 0x33, 0, 0, %[res], %[ptr], %[k]"` will generate `lw a0, 0(a2)\n\tadd a0,a0,a1\n\tsw a0, 0(a2)`, whereas the template `".insn r 0x33, 0, 0, %[res], %[res], %[k]"` will generate the exact same code (at least, for the simple example snippet). Wouldn't it be clearer to leave `%[res], %[res], %[k]`? – sharpgeek Dec 20 '22 at 10:06
That's much less clear, and buggy. It's reading an output-only operand, `%[res]`. The fact that GCC picks the same register for `[res]` as for `"r"(*ptr)` is only an implementation detail/choice, an optimization; something that could change with different surrounding code if it wants to reorder it with some other read of `*ptr`, so that input value is still available after the `add`. (Or a read of `bar` directly; the pointer variable isn't doing anything useful here and will fully optimize away.) – Peter Cordes Dec 20 '22 at 10:10
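
The distinction drawn in these last comments can be sketched as follows, again assuming GNU C extended asm. `accumulate` and `self_check` are hypothetical names, and the `#else` branch is a plain-C fallback for non-RISC-V hosts. The point is that a template which reads the operand it also writes should declare that operand read-write with `"+r"`; with `"=r"` the compiler is not obliged to put any particular value in the register beforehand.

```c
#include <assert.h>

/* acc += k as a read-modify-write of a single operand. The template uses
   %[acc] as both destination and first source, so it is declared "+r". */
long accumulate(long acc, long k) {
#if defined(__riscv)
    __asm__ (".insn r 0x33, 0, 0, %[acc], %[acc], %[k]"
             : [acc] "+r"(acc)     /* read and written by the template */
             : [k] "r"(k));        /* read only */
#else
    acc += k;   /* assumption: plain-C fallback for non-RISC-V hosts */
#endif
    return acc;
}

void self_check(void) {
    assert(accumulate(0xFF00, 0xA5) == 0xFFA5);
}
```

When the destination is write-only, three distinct operands with one `"=r"` output and two `"r"` inputs, as recommended above, give the compiler the most freedom.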
