4

I am trying to debug a complex Perl application that terminates with the error message "Signal SIGCHLD received, but no signal handler set". I know that the message comes from the Perl interpreter itself, specifically from the file mg.c, and that it cannot be caught. But I don't understand its exact nature.

Is this an internal error in the interpreter of the kind "should not happen"?

Or is there a (simple) way to reproduce this error with recent Perl versions?

I have already tried to reproduce it with the hints given in https://lists.gnu.org/archive/html/bug-parallel/2016-10/msg00000.html by setting and unsetting a signal handler in an endless loop while constantly firing that signal in an endless loop from another script. But I could not reproduce the behavior described there with Perl versions 5.18.4, 5.26.0, 5.26.2, 5.28.2, and 5.30.2.

I have also found the question "Signal SIGSTOP received, but no signal handler set in perl script", where somebody assigned to $SIG{SIGSTOP} instead of $SIG{STOP}, but that does not help to make the problem reproducible with a simple script either.
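To rule out the `$SIG{SIGSTOP}` kind of mistake in an application, the keys assigned to `%SIG` can be checked against the signal names perl was built with. This is only a minimal sketch: `is_valid_sig_key` is a hypothetical helper name, and it assumes that `$Config{sig_name}` lists the signals of the platform, which is the case on the usual Unix builds.

```perl
use strict;
use warnings;
use Config;

# Signal names as perl knows them, e.g. "HUP", "INT", "CHLD" (no "SIG" prefix).
my %known_signal = map { $_ => 1 } split ' ', $Config{sig_name};

# Hypothetical helper: true if $key names a signal that %SIG understands.
sub is_valid_sig_key {
    my ($key) = @_;
    return exists $known_signal{$key} ? 1 : 0;
}

for my $key (qw(CHLD SIGCHLD STOP SIGSTOP)) {
    printf "%-8s %s\n", $key,
        is_valid_sig_key($key) ? 'valid' : 'not a signal name';
}
```

The special keys `__WARN__` and `__DIE__` are not signals and are deliberately not covered by this check.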

All of the perls I have tested were built without thread support:

$ perl -Mthreads
This Perl not built to support threads
Compilation failed in require.
BEGIN failed--compilation aborted.
Guido Flohr
  • The problem does not just occur with SIGCHLD but also with SIGHUP and others. The question is what causes the error condition. In order to reproduce it in a small piece of code, I have tried to delete `$SIG{CHLD}`, I have set it to undef, set it to something invalid, but nothing causes Perl to terminate like in the "real" application. And when I dump the contents of `%SIG` before the error occurs, the error vanishes. – Guido Flohr Mar 16 '20 at 16:21
  • Can you run the real application under `gdb` and provide a backtrace? – Håkon Hægland Mar 16 '20 at 16:35
  • Are you only getting the error when using an older perl version? If so, they may have fixed an internal bug and upgrading is the solution. – Ted Lyngmo Mar 16 '20 at 17:05
  • @HåkonHægland unfortunately the server does not come up when running under gdb. I will try to reproduce it with less other stuff. – Guido Flohr Mar 17 '20 at 09:17
  • @Shawn I know how to handle signals. The problem is that the signal handling inside perl seems to be broken. Even if you never touch `%SIG`, the interpreter is not supposed to just exit your script. – Guido Flohr Mar 17 '20 at 09:18
  • @TedLyngmo I can reproduce it with 5.18.4, 5.26.0, 5.26.2, 5.28.2, and 5.30.2. – Guido Flohr Mar 17 '20 at 09:19
  • @GuidoFlohr Oh, ok, then I misunderstood the "_I could not reproduce the behavior described there with Perl versions 5.18.4, 5.26.0, 5.26.2, 5.28.2, and 5.30.2_" part. – Ted Lyngmo Mar 17 '20 at 09:49
  • @TedLyngmo Actually, downgrading makes the error disappear in this case, see my own answer below. – Guido Flohr Mar 20 '20 at 06:00

2 Answers

2

I am answering my own question here (to the best of my knowledge to date):

The error vanished by inserting these two lines:

$SIG{CHLD} ||= 'DEFAULT';
$SIG{HUP} ||= 'DEFAULT';

I would not call this a fix but rather a workaround, because the value 'DEFAULT' should trigger exactly the same behavior as no value or undef.
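The same workaround can be written as a loop. This is only a sketch under the assumption that CHLD and HUP are the only signals involved; the list `@signals` is an illustrative name, and other signals may need the same treatment in other applications.

```perl
use strict;
use warnings;

# Assumption: only the signals named in the workaround above are affected.
my @signals = qw(CHLD HUP);

for my $name (@signals) {
    # ||= leaves a handler (or 'IGNORE') that is already in place untouched
    # and only fills in 'DEFAULT' where the entry is still empty.
    $SIG{$name} ||= 'DEFAULT';
}

# Show what is in effect now.
printf "%-4s => %s\n", $_, $SIG{$_} for @signals;
```

Using `||=` rather than a plain assignment means that a handler installed earlier, for example by a framework, is not overwritten.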

The error message is an internal error of Perl. The interpreter bails out here as a guard against signal handling bugs in Perl.

That being said, there is also no simple example that reproduces the error. And if there were one, it would be a bug in Perl.

A similar error was reported for GNU parallel some time ago: https://lists.gnu.org/archive/html/bug-parallel/2016-10/msg00000.html

The error reported there has a couple of things in common with the error that I encountered, notably that it occurred after fork()ing.

My application is a server based on Net::Server, and the error occurs when a request handler spawns a child process. Interestingly, the error message (and the exit) happens before the child terminates.

The child process can potentially run for a very long time. Therefore it is made a session leader with setsid(), all open file descriptors are closed, and standard input is redirected to /dev/null before exec() is called. In other words, it is more or less daemonized.
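For readers who want to experiment, the spawning described above can be sketched roughly as follows. This is not the code of the real application: `spawn_detached` is a hypothetical name, the sketch assumes that only descriptors from 3 upwards are closed, and the upper bound for that loop is capped arbitrarily at 1024.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork a child, detach it, and exec @cmd in it.
# Returns the child's PID in the parent.
sub spawn_detached {
    my (@cmd) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    return $pid if $pid;    # parent

    # Child from here on. Failures leave via POSIX::_exit() so that END
    # blocks and destructors inherited from the parent do not run again.
    my $sid = POSIX::setsid();    # become session leader
    POSIX::_exit(1) if !defined $sid || $sid == -1;

    # Close inherited descriptors above stderr. Assumption: 1024 is enough;
    # the real limit may be higher.
    my $max_fd = POSIX::sysconf(POSIX::_SC_OPEN_MAX());
    $max_fd = 1024 if !$max_fd || $max_fd > 1024;
    POSIX::close($_) for 3 .. $max_fd - 1;

    # Redirect standard input to /dev/null.
    open(STDIN, '<', '/dev/null') or POSIX::_exit(1);

    # exec only returns on failure.
    exec { $cmd[0] } @cmd or POSIX::_exit(127);
}

# Usage (not run here): my $pid = spawn_detached('sleep', '3600');
```

The parent still receives SIGCHLD when such a child terminates, because setsid() changes the session, not the parent/child relationship.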

It should also be noted that the error vanishes when small modifications are made to the code, for example dumping the contents of %SIG for debugging purposes.

The error also did not occur with Perl versions 5.8.9, 5.14.4, and 5.16.3. With 5.18.4, 5.26.2, and 5.30.2 it can always be reproduced. All of these executables had been built without interpreter thread support.

Guido Flohr
-1

The "but no signal handler set" message is particular to threads, and signal handling works differently inside threads than it does in your main process.

The main process can receive a signal from any number of places -- a kill call in the program, a kill call from another Perl program, from the operating system, from a /usr/bin/kill command from the command line, etc. Perl has "default" signal handlers for all of the signals that can be sent from your system. You can trap some of them by setting up handlers in the global %SIG variable, though some signals (notably SIGKILL and SIGSTOP) cannot be trapped by your program.

Signals within threads are an emulation of signals in your parent operating system. They can only be sent by a call to threads::kill, which means they can only be signalled from within your Perl program. Perl does not set up any default signal handlers for thread signals, and warns you when an unhandled signal is given to a thread. And unlike the regular signal set, you may trap KILL and STOP signals sent to threads.

To be defensive, set a signal handler for all signals. Maybe something like:

use threads;
use Carp;

sub thr_signal_handler {
    # Perl passes the signal name (without the SIG prefix) as first argument
    Carp::cluck("Received SIG$_[0] in thread ", threads->tid(), "\n");
}

# inside thread initialization function
$SIG{$_} = \&thr_signal_handler for keys %SIG;
...
mob
  • I am not using thread enabled perl executables. So I am not facing a threads issue. But I will still try to install explicit signal handlers for all signals except SIGKILL and SIGSTOP. – Guido Flohr Mar 17 '20 at 09:33
  • To be 100 % sure, I have added `use threads` to the script and it bails out immediately. So I am not using a threads enabled interpreter. – Guido Flohr Mar 17 '20 at 09:39