Clang 5.0 is simply too old for the csrr pseudo-instruction, i.e. the pseudo-instruction support in Clang 5.0 is incomplete. Support for csrr was added in 2018, while Clang 5 was released in 2017.
Either upgrade to a newer Clang version or work around the issue by expanding csrr (control-and-status-register-read) to csrrs (control-and-status-register-read-and-set) directly in your code, i.e.
csrr dst, csr => csrrs dst, csr, x0
Note that there are even more specialized pseudo-instructions for reading performance-related counters, such as rdcycle dst and rdtime dst. Of course, they also expand to csrrs, but they might be more convenient for some use cases.
Also, with a complete toolchain, you can use the symbolic constants cycle, time, instret etc. (instead of 0xc00, 0xc01, 0xc02 etc.) directly in your assembler code.
Example that lists the equivalent ways to read out the cycle count:
extern __inline
unsigned long
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
rdcycle(void)
{
    unsigned long dst;
    // output into any register, likely a0
    // regular instruction:
    asm volatile ("csrrs %0, 0xc00, x0" : "=r" (dst) );
    // regular instruction with symbolic csr and register names
    // asm volatile ("csrrs %0, cycle, zero" : "=r" (dst) );
    // pseudo-instruction:
    // asm volatile ("csrr %0, cycle" : "=r" (dst) );
    // pseudo-instruction:
    // asm volatile ("rdcycle %0" : "=r" (dst) );
    return dst;
}
I don't think it's worth using the C preprocessor (CPP) for reading the different CSRs.
Note that even with asm volatile, the compiler is free to reorder the different inline assembler statements relative to each other and to other instructions. For example, in
unsigned long a = rdcycle();
int r = i * i;
unsigned long b = rdcycle();
the second CSR access might be reordered before the multiplication or even before the first one.
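One way to constrain this kind of reordering is to tie each counter read to the measured computation through asm operands, so that the compiler sees a data dependency. The following is only a minimal sketch under assumptions: the helper names rdcycle_before, rdcycle_after and square_timed are made up for illustration, and the non-RISC-V branch is a stand-in software counter that exists only so the sketch builds on other hosts. The first read declares the input value as read-and-written ("+r"), so the multiplication cannot be scheduled before it; the second read takes the result as an input ("r"), so it cannot be scheduled before the multiplication.

```c
#if !defined(__riscv)
/* Assumption: stand-in counter for hosts without the RISC-V cycle CSR. */
static unsigned long fake_cycles;
#endif

/* Hypothetical helper: read the cycle counter and mark *val as modified by
   the asm statement, so code consuming *val must come after this read. */
static inline unsigned long rdcycle_before(int *val)
{
    unsigned long dst;
#if defined(__riscv)
    __asm__ volatile ("csrrs %0, 0xc00, x0" : "=r" (dst), "+r" (*val));
#else
    __asm__ volatile ("" : "+r" (*val));
    dst = ++fake_cycles;
#endif
    return dst;
}

/* Hypothetical helper: read the cycle counter with val as an (unused) input
   operand, so val must already be computed when this read is emitted. */
static inline unsigned long rdcycle_after(int val)
{
    unsigned long dst;
#if defined(__riscv)
    __asm__ volatile ("csrrs %0, 0xc00, x0" : "=r" (dst) : "r" (val));
#else
    __asm__ volatile ("" : : "r" (val));
    dst = ++fake_cycles;
#endif
    return dst;
}

/* Times i * i; the operand dependencies keep the multiplication between
   the two counter reads in the generated code. */
int square_timed(int i, unsigned long *cycles)
{
    unsigned long a = rdcycle_before(&i);
    int r = i * i;
    unsigned long b = rdcycle_after(r);
    *cycles = b - a;
    return r;
}
```

This only constrains what the compiler emits, and only for the values named in the operands, so it is still worth checking the generated assembly for the code you actually measure.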