
I would like to ask how to write inline assembly for the Store-Conditional instruction in RISC-V. Below is some brief background (RISCV-ISA-Specification, page 40, section 7.2):

SC writes a word in rs2 to the address in rs1, provided a valid reservation still exists on that address. SC writes zero to rd on success or a nonzero code on failure.

The instruction we will focus on is SC.D, which conditionally stores a 64-bit value. As shown on page 106 of the RISCV-ISA-Specification, the instruction format is as follows:

00011 | aq<1> | rl<1> | rs2<5> | rs1<5> | 011 | rd<5> | 0101111

To generate the code for the SC.D instruction with inline assembly, we need 3 registers. The register list can be found here.

Each register field of the instruction is 5 bits wide. Hence, there are 32 general-purpose registers in RISC-V: x0, x1, ... x31. Each register also has an ABI (application binary interface) name. For instance, register x16 has the ABI name a6, so its 5-bit field value is 10000.

I choose the following registers assignment:

  • rs2: a6 register (register x16, i.e. 0b10000)
  • rs1: a7 register (register x17, i.e. 0b10001)
  • rd: s4 register (register x20, i.e. 0b10100)
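The name-to-number mapping behind these choices can be checked with a small lookup. This is only an illustrative sketch: `abi_to_xreg` is a hypothetical helper, not part of any toolchain, and it covers only the integer register names of the standard calling convention.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: map an ABI register name to its x-register number
 * (0..31), or return -1 if the name is not recognised. */
static int abi_to_xreg(const char *name)
{
    static const char *const fixed[] = { "zero", "ra", "sp", "gp", "tp" };
    assert(name != NULL);

    /* x0..x4 have fixed names. */
    for (int i = 0; i < 5; i++)
        if (strcmp(name, fixed[i]) == 0)
            return i;
    if (strcmp(name, "fp") == 0)
        return 8; /* fp is an alias for s0 */

    /* Remaining names are one letter (a, s, t) followed by a decimal index. */
    if (name[0] == '\0' || name[1] < '0' || name[1] > '9')
        return -1;
    char *end;
    long n = strtol(name + 1, &end, 10);
    if (*end != '\0')
        return -1;

    switch (name[0]) {
    case 'a': /* a0-a7 -> x10-x17 */
        return (n <= 7) ? 10 + (int)n : -1;
    case 't': /* t0-t2 -> x5-x7, t3-t6 -> x28-x31 */
        if (n <= 2) return 5 + (int)n;
        return (n <= 6) ? 28 + (int)(n - 3) : -1;
    case 's': /* s0-s1 -> x8-x9, s2-s11 -> x18-x27 */
        if (n <= 1) return 8 + (int)n;
        return (n <= 11) ? 18 + (int)(n - 2) : -1;
    default:
        return -1;
    }
}
```

With this helper, `abi_to_xreg("a6")`, `abi_to_xreg("a7")` and `abi_to_xreg("s4")` give 16, 17 and 20, matching the binary values listed above.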

Hence, by filling in the corresponding register bits of the original instruction, we have the following:

00011 | aq<1> | rl<1> | 10000 | 10001 | 011 | 10100 | 0101111

The two bits aq and rl specify the ordering constraints (page 40 of the RISCV-ISA-Specification):

If both the aq and rl bits are set, the atomic memory operation is sequentially consistent and cannot be observed to happen before any earlier memory operations or after any later memory operations in the same RISC-V hart, and can only be observed by any other hart in the same global order of all sequentially consistent atomic memory operations to the same address domain.

So we just set both bits to 1 since we want SC.D to be executed atomically. Now we have the final instruction bits:

00011 | 1 | 1 | 10000 | 10001 | 011 | 10100 | 0101111

-> 00011111|00001000|10111010|00101111
     0x1f     0x08     0xba     0x2f

Since RISC-V instructions are stored in little-endian byte order, the corresponding inline assembly can be generated by:

__asm__ volatile(".byte 0x2f, 0xba, 0x08, 0x1f");
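The hand encoding above can be double-checked mechanically. The following is a minimal sketch in portable C; `encode_sc_d` and `to_little_endian` are hypothetical helper names introduced here, and the field positions are taken from the instruction format quoted earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: assemble the 32-bit SC.D word from its fields. */
static uint32_t encode_sc_d(unsigned rd, unsigned rs1, unsigned rs2,
                            unsigned aq, unsigned rl)
{
    assert(rd < 32 && rs1 < 32 && rs2 < 32 && aq < 2 && rl < 2);
    return (0x03u << 27)   /* funct5 = 00011 (SC) */
         | (aq << 26)
         | (rl << 25)
         | (rs2 << 20)     /* value to store */
         | (rs1 << 15)     /* address */
         | (0x3u << 12)    /* funct3 = 011 (64-bit) */
         | (rd << 7)       /* status result */
         | 0x2fu;          /* opcode = 0101111 */
}

/* Hypothetical helper: split the word into the byte order it has in
 * memory, least significant byte first. */
static void to_little_endian(uint32_t insn, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(insn >> (8 * i));
}
```

Calling `encode_sc_d(20, 17, 16, 1, 1)` yields 0x1f08ba2f, and `to_little_endian` turns that into the bytes 0x2f, 0xba, 0x08, 0x1f, which agrees with the `.byte` directive.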

There are also some other preparations like loading values into rs1(a7) and rs2(a6) registers. Therefore, I have the following code (but it did not work as expected):

/**
 * rs2: holds the value to be written. I pick a6 register.
 * rs1: holds the address to be written to. I pick a7 register.
 * rd: holds the return value of SC.D instruction. I pick s4 register.
 * 
 * @src: the value to be written. rs2. a6 register 
 * @dst: the address to be written to. rs1. a7 register 
 * @rd: the value that holds the return value of SC.D
 */
static inline void sc(void *src, void *dst, uint64_t *rd) {
    uint64_t *tmp_src = (uint64_t *)src;
    uint64_t src_val = *tmp_src; // 13
    uint64_t dst_addr = (uint64_t)dst;
    uint64_t ret = 100;

    // first of all, need to prepare the registers a6 and a7.

    /* load value to be written into register a6 */
    __asm__ volatile("ld a6, %0"::"m"(src_val));

    /* load the address to be written to into register a7 */
    __asm__ volatile("ld a7, %0"::"m"(dst_addr));

    /* the actual SC.D: */
    __asm__ volatile(".byte 0x2f, 0xba, 0x08, 0x1f");
    // __asm__ volatile("sc.d s4, a6, (a7)"); // this does not work either.
    
    /* obtain the value in register s4 */
    __asm__ volatile("sd s4, %0":"=m"(ret));
    *rd = ret;

    return;
}

int main() {
    uint64_t *src = malloc(sizeof(uint64_t));
    uint64_t *dst = malloc(sizeof(uint64_t));
    uint64_t rd = 20;

    *src = 13;
    *dst = 3;

    sc(src, dst, &rd); // write value 13 into @dst, so @dst should be 13 afterwards

    // the expected output should be "dst: 13, rd: 0"
    // What I get: "dst: 3, rd: 1"
    printf("dst: %ld, rd: %ld\n", *dst, rd);

    return 0;
}

The code does not seem to change the dst value. May I know what I am doing wrong? Any hints would be appreciated.

Ethan L.
  • Just to make sure I understand, do you do a LR to create a reservation before executing the SC? – Joachim Isaksson Jun 28 '22 at 09:20
  • @JoachimIsaksson: yes, that is what I did. I am totally new to inline assembly so I might have done something wrong. My thinking is: since the SC instruction requires 3 registers, we need to load the corresponding values into those registers before we can execute the SC. This might be the "reservation" you mentioned in your question, but I am not sure. – Ethan L. Jun 28 '22 at 09:37
  • Besides, when I mentioned "I pick a6, a7 and s4 registers", I am not sure if that is the correct way. Will there be registers that are previously used which may cause conflict? I guess it might be. So one possible solution (I am not sure) is by issuing a memory barrier, to make sure that all instructions have finished before entering the current instruction. However, that may cause the instruction itself to be less efficient. I would like to find a way to make my inline assembly more generic, but I am still searching for them. – Ethan L. Jun 28 '22 at 09:42
  • If you need register values to survive between instructions, make sure all the instructions are in the same `asm` statement. If the assembler doesn't support the instruction directly, yes the easiest thing is to hard-code some registers, and tell the compiler which ones to pick using `register int *ptr asm("a7")` for example, to force the compiler to pick `a7` for a `"r"(ptr)` input. And make sure to declare a clobber on whatever reg you use as a temporary. See https://stackoverflow.com/tags/inline-assembly/info for docs and guides. – Peter Cordes Jun 28 '22 at 10:43
  • @PeterCordes: Thanks a lot for your answer and the link you provided. I know there is a tag but I did not know StackOverflow has a specific tag "wiki" corresponding to various topics. That is really helpful. I would go through those and see if I could figure out the solution. – Ethan L. Jun 28 '22 at 10:56
    [ARM inline asm: exit system call with value read from memory](https://stackoverflow.com/a/37363860) is the Q&A I had in mind for an example of using asm register vars. [How can I indicate that the memory \*pointed\* to by an inline ASM argument may be used?](https://stackoverflow.com/q/56432259) is relevant for asm statements with pointer operands you deref, when you want the pointer in a register. – Peter Cordes Jun 28 '22 at 11:00
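The points raised in the comments (a reservation created by LR before the SC, all instructions in one `asm` statement, and registers left to the compiler instead of hard-coded) can be sketched roughly as follows. This is an assumption-laden illustration, not a confirmed fix for the code in the question: `store_with_lr_sc` is a hypothetical name, the sketch uses the `lr.d`/`sc.d` mnemonics rather than `.byte` encodings (so it assumes a toolchain targeting RV64 with the A extension), and on any other target it falls back to a compiler builtin purely so that it still compiles.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store val to *dst with an LR.D/SC.D retry loop. */
static inline void store_with_lr_sc(uint64_t *dst, uint64_t val)
{
    /* LR/SC require a naturally aligned address. */
    assert(((uintptr_t)dst & 7u) == 0);
#if defined(__riscv) && (__riscv_xlen == 64)
    uint64_t old;     /* value returned by LR.D (unused here) */
    uint64_t failed;  /* SC.D status: 0 on success, nonzero on failure */
    __asm__ volatile(
        "1:  lr.d.aq  %0, (%2)\n"      /* load and create the reservation */
        "    sc.d.rl  %1, %3, (%2)\n"  /* store only if it still holds */
        "    bnez     %1, 1b\n"        /* retry if the SC failed */
        : "=&r"(old), "=&r"(failed)    /* early-clobber: written before inputs are last read */
        : "r"(dst), "r"(val)
        : "memory");                   /* the asm reads and writes *dst */
    (void)old;
#else
    /* Assumption: non-RISC-V fallback so the sketch compiles elsewhere;
     * this path does not exercise LR/SC at all. */
    __atomic_store_n(dst, val, __ATOMIC_SEQ_CST);
#endif
}
```

Because the LR and the SC sit in the same `asm` statement, the compiler cannot insert loads or stores between them, and the `"r"` constraints let it pick whichever registers are free instead of a6, a7 and s4. With `*dst` initially 3, `store_with_lr_sc(dst, 13)` leaves `*dst` equal to 13.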

0 Answers