When using GNU C inline asm, use constraints to tell the compiler where you want things, instead of doing it "manually" with instructions inside the asm template.
For writechar
and readchar
, we only need a "syscall"
as the template, with constraints to set up all the inputs in registers (and the pointed-to char
in memory for the write(2)
system call), according to the x86-64 Linux system-call convention (which very closely matches the System V ABI's function-calling convention). What are the calling conventions for UNIX & Linux system calls on i386 and x86-64.
This also makes it easy to avoid clobbering the red-zone (128 bytes below RSP), where the compiler might be keeping values. You must not clobber it from inline asm (so push
/ pop
aren't usable unless you sub rsp, 128
first: see Using base pointer register in C++ inline asm for that and many useful links about GNU C inline asm), and there's no way to tell the compiler you clobber it. You could build with -mno-redzone
, but in this case input/output operands are much better.
I'm hesitant to call these putchar
and getchar
. You can do that if you're implementing your own stdio that doesn't support buffering yet, but some functions require input buffering to implement correctly. For example, scanf
has to examine characters to see if they match the format string, and leave them "unread" if they don't. Output buffering is optional, though; you could I think fully implement stdio with functions that create a private buffer and write()
it, or directly write()
their input pointer.
writechar()
:
int writechar(char my_char)
{
int retval; // sys_write uses ssize_t, but we only pass len=1
// so the return value is either 1 on success or -1..-4095 for error
// and thus fits in int
asm volatile("syscall #dummy arg picked %[dummy]\n"
: "=a" (retval) /* output in EAX */
/* inputs: ssize_t read(int fd, const void *buf, size_t count); */
: "D"(1), // RDI = fd=stdout
"S"(&my_char), // RSI = buf
"d"(1) // RDX = length
, [dummy]"m" (my_char) // dummy memory input, otherwise compiler doesn't store the arg
/* clobbered regs */
: "rcx", "r11" // clobbered by syscall
);
// It doesn't matter what addressing mode "m"(my_char) picks,
// as long as it refers to the same memory as &my_char so the compiler actually does a store
return retval;
}
This compiles very efficiently with gcc -O3, on the Godbolt compiler explorer.
writechar:
movb %dil, -4(%rsp) # store my_char into the red-zone
movl $1, %edi
leaq -4(%rsp), %rsi
movl %edi, %edx # optimize because fd = len
syscall # dummy arg picked -4(%rsp)
ret
@nrz's test main
inlines it much more efficiently than the unsafe (red-zone clobbering) version in that answer, taking advantage of the fact that syscall
leaves most registers unmodified so it can just set them once.
main:
movl $97, %r8d # my_char = 'a'
leaq -1(%rsp), %rsi # rsi = &my_char
movl $1, %edx # len
.L6: # do {
movb %r8b, -1(%rsp) # store the char into the buffer
movl %edx, %edi # silly compiler doesn't hoist this out of the loop
syscall #dummy arg picked -1(%rsp)
addl $1, %r8d
cmpb $123, %r8b
jne .L6 # } while(++my_char < 'z'+1)
movb $10, -1(%rsp)
syscall #dummy arg picked -1(%rsp)
xorl %eax, %eax # return 0
ret
readchar(), done the same way:
int readchar(void)
{
int retval;
unsigned char my_char;
asm volatile("syscall #dummy arg picked %[dummy]\n"
/* outputs */
: "=a" (retval)
,[dummy]"=m" (my_char) // tell the compiler the asm dereferences &my_char
/* inputs: ssize_t read(int fd, void *buf, size_t count); */
: "D"(0), // RDI = fd=stdin
"S" (&my_char), // RDI = buf
"d"(1) // RDX = length
: "rcx", "r11" // clobbered by syscall
);
if (retval < 0) // -1 .. -4095 are -errno values
return retval;
return my_char; // else a 0..255 char / byte
}
Callers can check for error by checking c < 0
.