
I am using riscv64-unknown-elf-clang ("clang version 5.0.0") to compile my code, which I then run with "spike" and "pk". I need to count the number of clock cycles the program takes. I tried both "__builtin_readcyclecounter()" and the standard "clock()", but neither seems to work.

The code below works with riscv64-unknown-elf-gcc but not with riscv64-unknown-elf-clang:

#define read_csr(reg) ({ unsigned long __tmp;asm volatile ("csrr %0, " #reg : "=r"(__tmp));__tmp; })
#define CSR_CYCLE 0xc00
#define CSR_TIME 0xc01
#define CSR_INSTRET 0xc02
#define CSR_MCYCLE 0xb00

Then, in the main program, I call:

long cycles;
cycles=read_csr(cycle);
Sourav Das

1 Answer


Clang 5.0 is just too old for the csrr pseudo-instruction, i.e. the pseudo-instruction support in Clang 5.0 is incomplete. Support for csrr was added in 2018 while Clang 5 was released in 2017.

Either upgrade to a newer Clang version, or work around the issue by expanding csrr (control-and-status-register read) to csrrs (control-and-status-register read-and-set) directly in your code, i.e.

csrr dst, csr => csrrs dst, csr, x0 

Note that there are even more specialized pseudo-instructions for reading performance-related counters, such as rdcycle dst and rdtime dst. They also expand to csrrs, but might be more convenient for some use cases.

Also, with a complete toolchain, you can use the symbolic names cycle, time, instret etc. (instead of 0xc00, 0xc01, 0xc02 etc.) directly in your assembler code.

Example that lists the equivalent ways to read out the cycle count:

extern __inline
    unsigned long 
    __attribute__((__gnu_inline__, __always_inline__, __artificial__))
rdcycle(void)
{
    unsigned long dst;
    // output into any register, likely a0
    // regular instruction:
    asm volatile ("csrrs %0, 0xc00, x0" : "=r" (dst) );
    // regular instruction with symbolic csr and register names
    // asm volatile ("csrrs %0, cycle, zero" : "=r" (dst) );
    // pseudo-instruction:
    // asm volatile ("csrr %0, cycle" : "=r" (dst) );
    // pseudo-instruction:
    //asm volatile ("rdcycle %0" : "=r" (dst) );
    return dst;
}

I don't think it's worth using the C preprocessor (CPP) for reading the different CSRs.
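That said, if you want to keep the read_csr macro from the question, here is a minimal sketch of how it could be adapted to the csrrs workaround. Note that #reg in the original macro stringifies its argument without expanding it, so CSR_CYCLE would never turn into 0xc00; a two-level stringification is needed for that. The helper names CSR_STR and CSR_STR_ are made up for this example, and the clock() fallback is only an assumption so that the snippet still compiles when not targeting RISC-V.

```c
#include <time.h>   /* only needed for the host fallback below */

#define CSR_CYCLE   0xc00
#define CSR_TIME    0xc01
#define CSR_INSTRET 0xc02

/* Two-level stringification: CSR_STR(CSR_CYCLE) yields "0xc00",
 * whereas a plain #reg would yield "CSR_CYCLE". */
#define CSR_STR_(x) #x
#define CSR_STR(x)  CSR_STR_(x)

#if defined(__riscv)
/* csrrs with x0 as the source register is what csrr expands to. */
#define read_csr(reg) ({ unsigned long tmp_; \
    __asm__ volatile ("csrrs %0, " CSR_STR(reg) ", x0" : "=r" (tmp_)); \
    tmp_; })
#else
/* Assumption: off-target fallback, not a real cycle counter. */
#define read_csr(reg) ((unsigned long) clock())
#endif
```

With this, the call from the question becomes cycles = read_csr(CSR_CYCLE); and the numeric CSR address is what ends up in the assembler string.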

Note that even with asm volatile the compiler is free to reorder the inline assembler statements relative to each other and to other instructions. For example, in

unsigned long a = rdcycle();
int r = i * i;
unsigned long b = rdcycle();

the second CSR access might be reordered before the multiplication or even before the first one.
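One way to limit this kind of reordering is to give the compiler explicit data dependencies through the asm operands. The following is only a sketch under assumptions: rdcycle_after and timed_square are hypothetical names, the clock() branch exists only so the code compiles off-target, and the technique constrains the compiler's scheduling only, not what an out-of-order core does at run time.

```c
#include <time.h>   /* only needed for the host fallback below */

/* Read the cycle counter. `dep` is an artificial input operand, so the
 * compiler cannot schedule the read before `dep` has been computed. */
static inline unsigned long rdcycle_after(unsigned long dep)
{
    unsigned long dst;
#if defined(__riscv)
    __asm__ volatile ("csrrs %0, 0xc00, x0" : "=r" (dst) : "r" (dep));
#else
    /* Assumption: off-target fallback, not a real cycle counter. */
    (void) dep;
    dst = (unsigned long) clock();
#endif
    return dst;
}

/* Hypothetical helper: returns i * i and stores the elapsed count. */
static inline int timed_square(int i, unsigned long *cycles)
{
    unsigned long a = rdcycle_after(0);
#if defined(__GNUC__)
    /* Empty asm: makes `i` look dependent on `a`, so the multiplication
     * cannot be moved above the first counter read. */
    __asm__ volatile ("" : "+r" (i) : "r" (a));
#endif
    int r = i * i;
    /* The second read takes `r` as input, so it stays after the multiply. */
    unsigned long b = rdcycle_after((unsigned long) r);
    *cycles = b - a;
    return r;
}
```

The design choice here is to rely on operand dependencies rather than on the relative order of volatile asm statements, since the former is something the compiler has to honor when it schedules instructions.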

maxschlepzig