I am experimenting with my own fibers, mostly for educational purposes.
My platform is x86-64 Linux.
Here is my context-switching routine:
__taskflow_switch_cpu_context:
// preserve the $rbp, plays together with the `-fno-omit-frame-pointer`
pushq %rbp
movq %rsp, %rbp
// save the registers
pushq %r15
pushq %r14
pushq %r13
pushq %r12
pushq %rbx
movq %rsp, (%rdi) // save the outgoing fiber's $rsp
movq (%rsi), %rsp // load the incoming fiber's $rsp
// restore the registers in the reverse order
popq %rbx
popq %r12
popq %r13
popq %r14
popq %r15
// preserved %rbp no longer needed
popq %rbp
retq
The code above works well: it switches the context and allows nested function calls. Well, almost.
When I have a printfn("some pattern %i\n", no_matter_how_you_get_an_int()) call inside my fiber, it crashes with a segmentation fault. However, printfn("no patterns here\n") works fine.
I thought it was because of %rbp making a gigantic jump, but it seems I was mistaken.
P.S. When a fiber is created, a correct return address is pushed manually and the correct offsets are set up. Roughly, it looks like this:
/*
x86 stack grows towards lesser memory addresses;
so `res->stack + res->stack_size` is its very beginning.
*/
uintptr_t *stack = (uintptr_t *)(res->stack + res->stack_size - sizeof(void *));
// An initial return address.
*stack = (uintptr_t)fn; // A correct return address is there now.
// more unrelated code
res->cpu_context->rsp = (res->stack + res->stack_size - sizeof(void *) * 7); // Leaves room for the callee-saved registers together with the parent `rbp`.
Once again: since it works fine for non-printfn calls, I assume the initial stack layout is all right.
P.P.S. -fno-omit-frame-pointer is enabled too.
P.P.P.S. As Peter Cordes says in the comments, movaps seems to be the problem here (see the attached screenshot from my GDB session).
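To make the movaps theory checkable, here is a minimal sketch of the stack arithmetic. It is not part of my fiber code. The System V x86-64 ABI requires %rsp + 8 to be a multiple of 16 at function entry, and movaps faults on a memory operand that is not 16-byte aligned. The sketch assumes the top of the fiber stack (`res->stack + res->stack_size`) is 16-byte aligned. All helper names are hypothetical, and the "padded" layout is only one possible adjustment.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64: every stack slot touched by pushq/popq/retq is 8 bytes wide. */
enum { SLOT = 8 };

/* Registers the switch routine pops before `retq`:
   rbx, r12, r13, r14, r15, rbp. */
enum { SAVED_REGS = 6 };

/* %rsp at the first instruction of `fn`: the saved rsp advanced by
   six pops plus the return address consumed by `retq`. */
static uintptr_t rsp_at_entry(uintptr_t initial_rsp) {
    return initial_rsp + (SAVED_REGS + 1) * SLOT;
}

/* System V x86-64 ABI: at function entry, %rsp + 8 is a multiple of 16. */
static int entry_is_abi_aligned(uintptr_t rsp) {
    return (rsp + 8) % 16 == 0;
}

/* Layout from the post: return address at top - 8, saved rsp at top - 7*8. */
static uintptr_t initial_rsp_as_posted(uintptr_t stack_top) {
    assert(stack_top % 16 == 0); /* assumption: 16-byte aligned stack top */
    return stack_top - 7 * SLOT;
}

/* Hypothetical adjusted layout: one padding slot above the return address,
   so the return address sits at top - 16 and the saved rsp at top - 8*8. */
static uintptr_t initial_rsp_padded(uintptr_t stack_top) {
    assert(stack_top % 16 == 0); /* assumption: 16-byte aligned stack top */
    return stack_top - 8 * SLOT;
}
```

With a stack top of 0x1000, the posted layout enters `fn` with %rsp equal to 0x1000, which fails the entry check, while the padded layout enters with 0xFF8 and passes. That would fit the symptom: a call without variadic arguments may never hit an aligned SSE store, while the formatted one does.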