I have a strange setup that seems to be working well for me, but I need to know how to either make it better or how to live with it.
I am using C++ as a compiled scripting language for a game engine. The RISC-V system call ABI is the same as the C function calling convention, with the exception that instead of an 8th integer or pointer argument, A7 is used for the system call number. Yes, you know where this is going. Behold:
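#include <utility> // std::forward
// Thin stub in .text: just ecall + ret.
extern "C" long syscall_enter(...);
template <typename... Args>
inline long syscall(long syscall_n, Args&&... args)
{
    // Load the system call number into a7. The "i" constraint needs
    // syscall_n to be a compile-time constant once this is inlined.
    asm volatile ("li a7, %0" : : "i"(syscall_n));
    return syscall_enter(std::forward<Args>(args)...);
}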
Here syscall_enter is just a symbol in .text containing the ecall instruction followed by a ret. The system call return value also comes back in the same register (a0) as a normal function return value.
000103f0 <syscall_enter>:
syscall_enter():
103f0: 00000073 ecall
103f4: 00008067 ret
Before this, I had to create 20+ functions, each with a compiler barrier, to cover all the various ways to make system calls with integers and pointers. When I wanted to add a function that took floating-point values, the compiler said the call was ambiguous, since integers and floats convert back and forth implicitly. So I could either start adding unique names to the functions, or solve this mess a better way. It was honestly irritating and putting a damper on an otherwise excellent experience. I really love being able to use C++ on "both sides".
The instructions generated by the compiler seem alright. It calls syscall_enter with JAL or JALR, which is fine. The compiler seems a little bit confused, but I don't mind one extra instruction:
10204: 1f500793 li a5,501
10208: 00078893 mv a7,a5
1020c: 00000513 li a0,0
10210: 1e0000ef jal ra,103f0 <syscall_enter>
And here is the call that centers the camera on a position:
100d4: 19600793 li a5,406
100d8: 00078893 mv a7,a5
100dc: 000127b7 lui a5,0x12
100e0: 4207b587 fld fa1,1056(a5) # 12420 <_exit+0x2308>
100e4: 22b58553 fmv.d fa0,fa1
100e8: 010000ef jal ra,100f8 <syscall_enter>
Again, one extra move instruction. Looks alright. The API is already in heavy use, and there is also a threading API that works with this.
Now, is there an even better way? I couldn't think of a better way to load a7 with a number and then force the compiler to set up a function call, without making an actual function call. I was thinking about using a template parameter for the system call number, but I'm not so sure about the rest. Maybe we can constrain the number of arguments to 7? The check won't be exact when integer and floating-point arguments are mixed, since they use separate registers, but that's fine. Stack-stored structs are easy to pass.
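One way the template-parameter idea could look, as a rough sketch rather than a tested replacement. The system call number becomes a non-type template parameter, so the "i" constraint always sees a constant even when nothing is inlined, and the argument count is capped at 7. The name syscall_const and the host-only fake_a7 stand-in are made up for illustration. The a7 trick itself is unchanged and still relies on the compiler leaving a7 alone before the call.

```cpp
#include <utility>

#if defined(__riscv)
// On the target: the ecall + ret stub from the question.
extern "C" long syscall_enter(...);
#else
// Host-only stand-in (an assumption of this sketch) so the wrapper can be
// compiled and exercised off-target: it simply echoes the fake "register".
static long fake_a7 = -1;
static long syscall_enter(...) { return fake_a7; }
#endif

// Hypothetical variant of syscall(): the number N is a template parameter.
template <long N, typename... Args>
inline long syscall_const(Args&&... args)
{
    // a7 carries the number, so only a0-a6 remain for integer arguments.
    static_assert(sizeof...(Args) <= 7,
                  "a0-a6 hold at most 7 arguments; a7 is the system call number");
#if defined(__riscv)
    // N is a constant expression, so "i" is satisfied at any optimization level.
    asm volatile ("li a7, %0" : : "i"(N) : "memory");
#else
    fake_a7 = N;
#endif
    return syscall_enter(std::forward<Args>(args)...);
}
```

A call would then read syscall_const<501>(0) instead of syscall(501, 0).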
After some testing, I have decided to use this:
#include <utility> // std::forward
extern "C" long syscall_enter(...);
template <typename... Args>
inline long syscall(long syscall_n, Args&&... args)
{
    // a7 holds the system call number, leaving a0-a6 for 7 integer arguments.
    // This will prevent some cases of too many arguments,
    // but not a mix of float and integral arguments.
    static_assert(sizeof...(args) < 8, "There is a system call limit of 7 integer arguments");
    // The memory clobber prevents reordering around the a7 load
    asm volatile ("li a7, %0" : : "i"(syscall_n) : "a7", "memory");
    const long result = syscall_enter(std::forward<Args>(args)...);
    // Barrier after the call; it must come before the return to be reachable
    asm volatile("" : : : "memory");
    return result;
}
Should suffice. No need for syscall function spam. The argument-count check is not optimal, since it should only prevent use of the 8th integer register (which means counting integral, pointer and reference parameters, and ignoring floating-point ones). But it will prevent some cases.
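A sketch of how that count could be done at compile time. It assumes, as the disassembly above suggests, that floating-point arguments travel in the fa registers, and that every integral, enum or pointer argument takes exactly one integer register. That second assumption does not hold for 64-bit integers on RV32, and structs passed by value are ignored entirely. The names uses_integer_register, integer_register_count and fits_syscall_registers are hypothetical.

```cpp
#include <cstddef>
#include <type_traits>

// Does an argument of type T end up in an integer register?
// std::decay strips references and turns arrays (string literals) into pointers.
template <typename T>
struct uses_integer_register
{
    using U = typename std::decay<T>::type;
    static constexpr bool value = std::is_integral<U>::value
                               || std::is_enum<U>::value
                               || std::is_pointer<U>::value;
};

// Recursively count the integer-register arguments in a parameter pack.
template <typename... Args>
struct integer_register_count;

template <>
struct integer_register_count<>
{
    static constexpr std::size_t value = 0;
};

template <typename First, typename... Rest>
struct integer_register_count<First, Rest...>
{
    static constexpr std::size_t value =
        (uses_integer_register<First>::value ? 1 : 0)
        + integer_register_count<Rest...>::value;
};

// a7 is taken by the system call number, so a0-a6 leave room for 7.
template <typename... Args>
constexpr bool fits_syscall_registers()
{
    return integer_register_count<Args...>::value <= 7;
}
```

Inside syscall(), the sizeof... check could then become static_assert(fits_syscall_registers<Args...>(), "too many integer arguments"), which would let a call with 7 integers plus a few doubles through while still rejecting an 8th integer.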