How does re-parenting of a stopped process happen? Why does a stopped process just terminate after re-parenting?

More precisely, suppose I have code like this:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t child;

    child = fork();

    if (child == 0) {
        /* Child: report its pid and its parent's pid once per second, forever. */
        int t = 0;
        while (1) {
            printf("%d. I'm %d, my parent is %d\n", t++, getpid(), getppid());
            sleep(1);
        }
    } else {
        /* Parent: sleep for 30 seconds, then exit and leave the child behind. */
        printf("I'm the parent. My pid is %d\n", getpid());
        printf("Starting to wait for 30 seconds\n");
        sleep(30);
        printf("Done waiting, aborting\n");
    }
    return 0;
}

When I run this code, the child process runs and the parent process just sleeps. After 30 seconds the parent process terminates, and the child process becomes a child of init and simply continues running. Everything is normal.

But if I run this code and, within the first 30 seconds of its execution, I also run

kill -SIGSTOP <child_pid>

then the child process stops (T state in ps xaf) and the parent process sleeps. After 30 seconds the parent process returns from sleep and terminates (as it has reached the end of main), but the child process, instead of being re-parented to init in the stopped state, just terminates. I don't see it in ps xaf, and if I run lastcomm I see this output:

a.out             F  X equi     pts/5      0.00 secs Wed Mar 16 17:44

Why does the stopped process die after re-parenting? Is it possible in Linux to re-parent a stopped process?

PepeHands
  • Why do you want to do this? – nzc Mar 16 '16 at 15:05
  • @nzc there is a tool called `criu` (see criu.org or https://github.com/xemul/criu). I want to add a `--leave-stopped` option to `criu restore` (it is currently only available with `criu dump`). – PepeHands Mar 16 '16 at 15:09
  • I think you might get better results if you include some of those details in your question. I think (but am not certain) that part of the issue is that you may be using job control signals in ways they weren't intended to be used. Including the context for why you want to use them might help someone who does know, to explain why and perhaps suggest the correct approach. – nzc Mar 16 '16 at 15:16

1 Answer

From http://www.gnu.org/software/libc/manual/html_node/Job-Control-Signals.html

When a process in an orphaned process group (see Orphaned Process Groups) receives a SIGTSTP, SIGTTIN, or SIGTTOU signal and does not handle it, the process does not stop. Stopping the process would probably not be very useful, since there is no shell program that will notice it stop and allow the user to continue it. What happens instead depends on the operating system you are using. Some systems may do nothing; others may deliver another signal instead, such as SIGKILL or SIGHUP. On GNU/Hurd systems, the process dies with SIGKILL; this avoids the problem of many stopped, orphaned processes lying around the system.

See also: What's the difference between SIGSTOP and SIGTSTP?
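
To make the mechanism concrete: POSIX specifies that when a process exits and thereby orphans a process group that contains a stopped member, every process in that group is sent SIGHUP followed by SIGCONT. The sketch below tries to reproduce the situation from the question in a controlled way. The names `stopped_child_survives` and `on_sighup` are made up for this illustration, and the extra session leader standing in for the shell is an assumption added so that the result does not depend on how the program is launched. It is a minimal demonstration, not a tested recipe for criu.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Empty handler: catching SIGHUP replaces its default action (terminate). */
static void on_sighup(int sig) { (void)sig; }

/*
 * Hypothetical helper: stops a child, orphans its process group, and reports
 * what happened to the child. Returns 1 if the child was resumed and kept
 * running, 0 if it never got that far (normally because SIGHUP killed it),
 * and -1 if the pipe or the first fork failed.
 */
int stopped_child_survives(int handle_sighup)
{
    int fds[2];

    assert(handle_sighup == 0 || handle_sighup == 1);
    if (pipe(fds) == -1)
        return -1;

    pid_t leader = fork();
    if (leader == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (leader == 0) {
        /* Plays the role of the shell: leader of a fresh session. */
        close(fds[0]);
        if (setsid() == -1)
            _exit(1);
        pid_t parent = fork();
        if (parent == -1)
            _exit(1);
        if (parent == 0) {
            /* Plays the role of a.out: its own process group, one child. */
            if (setpgid(0, 0) == -1)
                _exit(1);
            pid_t child = fork();
            if (child == -1)
                _exit(1);
            if (child == 0) {
                signal(SIGHUP, handle_sighup ? on_sighup : SIG_DFL);
                raise(SIGSTOP); /* stop, as "kill -SIGSTOP" would */
                /* Reached only if the child was continued and not killed. */
                if (write(fds[1], "x", 1) != 1)
                    _exit(1);
                _exit(0);
            }
            /* Exit only once the child is really stopped; this exit
             * orphans the process group. */
            int status;
            if (waitpid(child, &status, WUNTRACED) == -1)
                _exit(1);
            _exit(0);
        }
        close(fds[1]);
        waitpid(parent, NULL, 0);
        _exit(0);
    }

    /* Original process: one byte means "survived", EOF means "died". */
    close(fds[1]);
    char byte;
    ssize_t n = read(fds[0], &byte, 1);
    close(fds[0]);
    waitpid(leader, NULL, 0);
    return n == 1 ? 1 : 0;
}
```

Calling `stopped_child_survives(0)` corresponds to the program in the question: the stopped child has the default SIGHUP disposition, so it should be terminated as soon as its parent exits. With `stopped_child_survives(1)` the child catches SIGHUP and should survive, but note that the accompanying SIGCONT resumes it, so it does not remain in the stopped state after re-parenting.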

nzc
  • Actually, I don't understand how `SIGTSTP`, `SIGTTIN` or `SIGTTOU` could occur here. The document you linked says "When a process group becomes an orphan, its processes are sent a SIGHUP signal". Does this signal cause the termination of my child program? If so, why doesn't this signal terminate my program when it is not stopped? – PepeHands Mar 16 '16 at 15:24
  • I think the real reason here is that "stopped" is a state that has meaning only if the process has "job control", because then it can later be restarted. I think that when the process loses job control, the system sends it a SIGHUP because otherwise it would just be stopped forever. See the last line "... this avoids the problem of many stopped, orphaned processes lying around the system." – nzc Mar 17 '16 at 20:28
  • Also, re-reading your comment, there is no "SIGHUP" sent when your process is not stopped. The SIGHUP is sent because your process *is* stopped when it becomes an orphaned process. – nzc Mar 23 '16 at 13:31