
I have reduced a large fiber scheduler that exhibited the problem to the code below.
What I expect is a clean return to the context passed to the handler, every time.
What I get is "Handler. " printed three times, followed by a segmentation fault.

#include <ucontext.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>  /* getpid(), sleep() */

ucontext_t currently_executed_context;

void handler_sigusr1(int signum, siginfo_t* siginfo, void* context)
{
    currently_executed_context = (*(ucontext_t*)context);

    printf("Handler. ");
    setcontext(&currently_executed_context);
}

int main()
{
    setbuf(stdout,0);

    struct sigaction action_handler;

    /* Block no additional signals while the handler runs. */
    sigemptyset(&action_handler.sa_mask);
    action_handler.sa_sigaction = handler_sigusr1;
    action_handler.sa_flags = SA_SIGINFO;

    sigaction(SIGUSR1,&action_handler,NULL);

    for(;;) { kill(getpid(),SIGUSR1); sleep(1); }

    return 0;
}

Reproduced with both gcc-4.4.3 and gcc-4.4.5 on two different Linux distributions.

John Locke
  • Not 100% sure on this, but do you really need to copy the `ucontext_t` like that? Instead, just do `setcontext( (ucontext_t*) context);` in your handler (casting the `void*` to the right type and passing it on...). – twalberg Mar 27 '13 at 18:42
  • It's always safer to take the data that a pointer points to (when the original object is not needed or huge), especially in a situation where you might lose it. Also, due to lack of ideas and several hours of changes, I already tried every variation of that solution (and even retried it now just to make sure) - no go. – John Locke Mar 27 '13 at 19:02
  • In general, I agree on safety, but I did note the man page for `setcontext()` claims the `ucontext_t` *must* be one that came from `getcontext()`, `makecontext()`, or as the parameter passed to a signal handler, which the manual copy sorta bypasses. Not sure how doing a byte-wise copy of the structure might invalidate that, but figured it didn't hurt to ask... – twalberg Mar 27 '13 at 19:35
  • Curiously, this runs fine under valgrind, but segfaults for me too otherwise. – teppic Mar 27 '13 at 21:04

1 Answer


At this point, I can offer my own research into the problem as a partial answer.

Firstly, I found this article, which is old and does not cite any official sources: http://zwillow.blogspot.com/2007/04/linux-signal-handling-is-broken.html. The relevant passage:

Second problem: You can't use setcontext() to leave signal handler and jump into another, previously saved, context. (Or, for that matter, you can't use it to return to the very same context passed as argument to the signal handler.) In other words, signal handler like

static void sighandler(
   int signo, siginfo_t *psi, void *pv)
{
  memcpy(puc_old, pv, sizeof(ucontext_t));
  /* choose another context to dispatch */
  setcontext(puc_another);
}

does not work. It does not restore signal mask specified in the puc_other, does not reestablish alternate signal stack, etc. However, this scheme works flawlessly on Solaris.

If somebody can confirm the part about Solaris, it would be appreciated.

Secondly, after speaking with a university lecturer, I've come to understand that setting or swapping a context from a signal handler is not as straightforward as doing so in other situations. Sadly, the person who explained this to me could not provide further details at the time.

Thus, neither of my sources seems entirely reliable, but both are clues nonetheless.
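
For comparison, here is a minimal sketch of a handler that does not call setcontext() at all. It assumes the only goal is to resume the interrupted code: on Linux, a handler that returns normally has its interrupted context restored by the kernel. It does not show how to switch to a different fiber. The names `handled_count`, `handler_return_normally` and `raise_and_count` are hypothetical and not part of the original program.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical counter, used only to observe how often the handler ran. */
static volatile sig_atomic_t handled_count = 0;

static void handler_return_normally(int signum, siginfo_t *siginfo, void *context)
{
    static const char msg[] = "Handler. ";
    ssize_t ignored;

    (void)signum;
    (void)siginfo;
    (void)context;   /* the saved context is left untouched */

    /* write() is async-signal-safe, unlike printf(). */
    ignored = write(STDOUT_FILENO, msg, sizeof msg - 1);
    (void)ignored;

    handled_count++;
    /* No setcontext() here: a plain return resumes the interrupted code. */
}

/* Hypothetical helper: installs the handler, sends SIGUSR1 to this
 * process n times, and returns how many times the handler ran
 * (or -1 if the handler could not be installed). */
int raise_and_count(int n)
{
    struct sigaction action;
    int i;

    memset(&action, 0, sizeof action);
    sigemptyset(&action.sa_mask);
    action.sa_sigaction = handler_return_normally;
    action.sa_flags = SA_SIGINFO;

    if (sigaction(SIGUSR1, &action, NULL) != 0)
        return -1;

    handled_count = 0;
    for (i = 0; i < n; i++)
        kill(getpid(), SIGUSR1);

    return (int)handled_count;
}
```

Calling `raise_and_count(3)` from a single-threaded `main` should run the handler once per signal and return to the loop each time. Whether this pattern can be extended to dispatch another context is exactly the open question above.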

John Locke