
The short question is, what should a shell do if it is in an orphaned process group that doesn't own the tty? But I recommend reading the long question because it's amusing.

Here is a fun and exciting way to turn your laptop into a portable space heater, using your favorite shell (unless you're one of those tcsh weirdos):

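#include <unistd.h>

int main(void) {
    if (fork() == 0) {
        /* Child: exec an interactive shell. The parent returns right away,
           leaving the shell behind in an orphaned process group. */
        execl("/bin/bash", "/bin/bash", (char *)NULL);
        _exit(127);  /* reached only if execl fails */
    }
    return 0;
}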

This causes bash to peg the CPU at 100%. zsh and fish do the same, while ksh and tcsh mumble something about job control and then keel over, which is a bit better, but not much. Oh, and it's a platform-agnostic offender: OS X and Linux are both affected.

My (potentially wrong) explanation is as follows: the child shell detects it is not in the foreground: tcgetpgrp(0) != getpgrp(). Therefore it tries to stop itself: killpg(getpgrp(), SIGTTIN). But its process group is orphaned, because its parent (the C program) was the leader and died, and SIGTTIN sent to an orphaned process group is just dropped (otherwise nothing could start it again). Therefore, the child shell is not stopped, but it's still in the background, so it does it all again, right away. Rinse and repeat.

My question is, how can a command line shell detect this scenario, and what is the right thing for it to do? My thought is that the shell tries to read from stdin, and just exits if read gives it EIO.

Thanks for your thoughts!

Edit: I tried doing a zero-length read() on /dev/tty, and that succeeded, which is bad. To get the EIO, I actually have to be prepared to read some data off of /dev/tty.
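
To make the read idea concrete, here is a minimal sketch in C. The names probe_result and probe_read are made up for illustration and are not taken from any shell's source. It assumes the kernel reports EIO for a background read from an orphaned process group, which is what POSIX describes; EIO can have other causes too, so the result is a hint rather than proof. It asks for one byte because the zero-length read succeeds.

```c
#define _POSIX_C_SOURCE 200809L  /* POSIX declarations under strict compilers */
#include <errno.h>
#include <unistd.h>

/* Hypothetical result codes for the probe. */
enum probe_result { PROBE_OK, PROBE_EOF, PROBE_ORPHANED, PROBE_ERROR };

/* Try to read exactly one byte from fd into *out.
   If we are in the background of a tty and NOT orphaned, the default
   SIGTTIN action stops the process inside read(), which is the normal
   job-control behavior. If we are orphaned, read() is expected to fail
   with EIO instead (assumption based on POSIX wording). */
static enum probe_result probe_read(int fd, char *out) {
    for (;;) {
        ssize_t n = read(fd, out, 1);  /* must request at least one byte */
        if (n == 1)
            return PROBE_OK;           /* real input: caller must keep *out */
        if (n == 0)
            return PROBE_EOF;
        if (errno == EINTR)
            continue;                  /* interrupted by a signal, retry */
        return errno == EIO ? PROBE_ORPHANED : PROBE_ERROR;
    }
}
```

A shell could call probe_read(STDIN_FILENO, &c) before entering its job-control loop and exit on PROBE_ORPHANED. When the probe returns PROBE_OK the byte is genuine input, so it has to be handed to the line editor rather than discarded.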

Edit: Another thought I had was to kill(getpgrp(), 0). If the process group is orphaned, then I believe this will always fail. However, it may also fail because I don't have permission to signal the session leader.
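
A related angle, sketched with heavy caveats: POSIX defines a process group as orphaned when every member's parent is either in the group itself or in a different session. The check below applies that rule to the calling process's own parent only, so it can report "my parent is not keeping this group alive" but cannot see other members of the group. is_anchor and parent_anchors_group are hypothetical names, and the sketch assumes getpgid() and getsid() are permitted to inspect the parent, which may not hold on every system.

```c
#define _POSIX_C_SOURCE 200809L  /* for getpgid() and getsid() */
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Pure form of the POSIX rule for one member: its parent keeps the group
   from being orphaned if the parent is in a different process group but
   in the same session. */
static bool is_anchor(pid_t pgid, pid_t sid, pid_t parent_pgid, pid_t parent_sid) {
    return parent_pgid != pgid && parent_sid == sid;
}

/* Applies is_anchor() to the calling process. If the parent cannot be
   inspected, this returns false; treating that as "no anchor" is an
   assumption of this sketch. */
static bool parent_anchors_group(void) {
    pid_t parent = getppid();
    pid_t parent_pgid = getpgid(parent);
    pid_t parent_sid = getsid(parent);
    if (parent_pgid == -1 || parent_sid == -1)
        return false;
    return is_anchor(getpgrp(), getsid(0), parent_pgid, parent_sid);
}
```

In the reduced case above, the shell's parent has exited, so getppid() names whatever process adopted it (typically init), which normally lives in another session. A false result is only conclusive if the shell is the sole member of its process group.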

Edit: For anyone finding this later, what I ended up doing is described at https://github.com/fish-shell/fish-shell/issues/422 . Also, how's the future?

ridiculous_fish
  • +1 for the portable space heater remark. I giggled ;) – Damien Overeem Dec 05 '12 at 08:00
  • This would be better suited to http://unix.stackexchange.com/ – mtk Dec 10 '12 at 17:33
  • @Will How is this not a programming question? – Gilles 'SO- stop being evil' Dec 30 '12 at 00:56
  • Not sure what you expected here, but this seems legit. If you fork off infinite threads, you eventually consume all the CPU. A zero-length read is essentially a no-op. To the question in the title, a process without a tty that tries to read or write to a tty will behave poorly since its filehandle is no longer valid. Usually this means it will crash, but in some cases it can clobber something important. – saarp Jan 01 '13 at 01:30
  • @saarp - what infinite threads? There is no loop here. Just a single fork of a shell process that inexplicably consumes the CPU when its parent exits. – Mark Reed Jan 04 '13 at 14:14
  • As to the solution on the github thread, I don't like the bit about reading a byte from /dev/tty and dropping it on the floor. That's rude behavior for a process. – Mark Reed Jan 04 '13 at 14:16
  • @MarkReed - Hrm, I could have sworn there was a while in there somewhere. In any case, I don't think your explanation is correct because you can open up shells within shells just fine. Most shells probably share the same source code for OSX and Linux. You may just want to grab the source and build a debug version. My guess is it's something to do with being unable to open stdin/stdout/stderr. Not sure what application this has in the real world. – saarp Jan 04 '13 at 16:10
  • What explanation? I didn't explain anything. The OP is the author of a shell and trying to track down the source of this odd behavior, which also happens in less trivial, real-world situations, not just the reduced case here. – Mark Reed Jan 04 '13 at 16:32

1 Answer


Here's what strace says is happening:

--- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
rt_sigaction(SIGTTIN, {SIG_IGN, [], SA_RESTORER, 0x7fd5f6989d80}, {SIG_DFL, [], SA_RESTORER, 0x7fd5f6989d80}, 8) = 0
ioctl(255, TIOCGPGRP, [9954])           = 0
rt_sigaction(SIGTTIN, {SIG_DFL, [], SA_RESTORER, 0x7fd5f6989d80}, {SIG_IGN, [], SA_RESTORER, 0x7fd5f6989d80}, 8) = 0
kill(0, SIGTTIN)                        = 0
--- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
rt_sigaction(SIGTTIN, {SIG_IGN, [], SA_RESTORER, 0x7fd5f6989d80}, {SIG_DFL, [], SA_RESTORER, 0x7fd5f6989d80}, 8) = 0
ioctl(255, TIOCGPGRP, [9954])           = 0
rt_sigaction(SIGTTIN, {SIG_DFL, [], SA_RESTORER, 0x7fd5f6989d80}, {SIG_IGN, [], SA_RESTORER, 0x7fd5f6989d80}, 8) = 0
kill(0, SIGTTIN)                        = 0
[repeat...]

and here is why, from jobs.c, bash 4.2:

  while ((terminal_pgrp = tcgetpgrp (shell_tty)) != -1)
    {
      if (shell_pgrp != terminal_pgrp)
        {
          SigHandler *ottin;

          ottin = set_signal_handler(SIGTTIN, SIG_DFL);
          kill (0, SIGTTIN);
          set_signal_handler (SIGTTIN, ottin);
          continue;
        } 
      break;
    } 

As for what to do about it, that's beyond my ability. But I thought this was useful information, and a bit much for a comment.
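
Purely as an illustration of the failure mode, here is a sketch of the same loop with a retry cap, so that it gives up instead of spinning. This is not bash code and not a tested fix: foreground_state and wait_for_foreground are hypothetical names, and the idea of capping the retries is an assumption about what a shell might do.

```c
#define _POSIX_C_SOURCE 200809L  /* POSIX declarations under strict compilers */
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

enum fg_state { FG_YES, FG_NO, FG_NOTTY };

/* Does our process group own the terminal behind tty_fd?
   FG_NOTTY covers any tcgetpgrp() failure (not a tty, bad fd, ...). */
static enum fg_state foreground_state(int tty_fd) {
    pid_t owner = tcgetpgrp(tty_fd);
    if (owner == -1)
        return FG_NOTTY;
    return owner == getpgrp() ? FG_YES : FG_NO;
}

/* Returns 0 once we are in the foreground, -1 if there is no usable tty
   or we are still in the background after max_attempts SIGTTINs. */
static int wait_for_foreground(int tty_fd, int max_attempts) {
    for (int sent = 0; ; sent++) {
        enum fg_state st = foreground_state(tty_fd);
        if (st == FG_YES)
            return 0;
        if (st == FG_NOTTY || sent >= max_attempts)
            return -1;
        /* Normally this stops us until someone sends SIGCONT. In an
           orphaned process group the signal is discarded, so kill()
           returns immediately and we come straight back around. */
        void (*old)(int) = signal(SIGTTIN, SIG_DFL);
        kill(0, SIGTTIN);
        signal(SIGTTIN, old);
    }
}
```

In the orphaned case every kill() returns at once, so the cap is reached almost immediately and the caller can decide to exit or to carry on without job control. In the ordinary background case the process really does stop, and the loop only advances each time it is continued.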

Phil Frost