
I have a multi-process program. To illustrate the problem briefly: the child process simply blocks, and the parent process checks whether the child still exists; if it does, the parent kills it.

My code is below:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <time.h>
#include <errno.h> 
#include <sys/socket.h> 
#include <string.h>

#define TIME_OUT 3 

int get_current_time()
{
    struct timespec t;
    clock_gettime(CLOCK_REALTIME, &t);
    return t.tv_sec;
}

void child_process_exec() 
{
    int fd = open("./myfifo", O_WRONLY); // myfifo is a named pipe, here will be blocked.
    sleep(10);
}

void parent_process_exec(pid_t workProcessId)
{
    int status;
    int childRes; 
    int lastHeartBeatTime = get_current_time();
    while(1) {
        sleep(1);
        if (get_current_time() - lastHeartBeatTime> TIME_OUT) {
            childRes = waitpid(workProcessId, &status, WNOHANG);
            if(childRes == 0) {  
                printf("kill process\n"); 
                printf("kill get %d\n", kill(workProcessId, SIGTERM));
            }
            
            workProcessId = fork();
            if(workProcessId > 0) {
                lastHeartBeatTime = get_current_time();
            } else {
                printf("start up child process again\n");
                child_process_exec();
                return;
            }
        }
    }
}

int main()
{
    pid_t workProcessId = fork();
    if (workProcessId > 0) {
        parent_process_exec(workProcessId);
    } else {
        child_process_exec();
    }
    
    return 0;
}

But when I run ps in the terminal, the child process is shown as <defunct>. Why is the child process a zombie after I kill() it? How can I kill the child process cleanly?

Yongqi Z
  • [What is a <defunct> process, and why doesn't it get killed?](https://askubuntu.com/questions/201303/what-is-a-defunct-process-and-why-doesnt-it-get-killed) (First entry I get when doing a search for `linux ps "<defunct>"`) – Some programmer dude Jul 03 '23 at 07:02
  • I read this before asking this question, but it didn't resolve my doubts. In my code, the child process is sleeping or blocked; it is not dead. Why can I kill it? – Yongqi Z Jul 03 '23 at 07:08
  • Did you read the second answer, not only the accepted answer (which is formulated in a confusing way IMO)? It quotes the `ps` manual page: "Processes marked **`<defunct>`** are dead processes (so-called "zombies") that remain because their parent has not destroyed them properly. These processes will be destroyed by `init(8)` if the parent process exits." So these "defunct" processes are simply child processes that have ceased to run, but still haven't been *reaped* by the parent process. Unlike movie zombies, zombie processes can't be killed because they're already dead. – Some programmer dude Jul 03 '23 at 07:13
  • @TomKarzes I have edited the question. To keep the example brief, I originally just put the child to sleep. It is supposed to open a named pipe for writing, which blocks. – Yongqi Z Jul 03 '23 at 07:18
  • @Someprogrammerdude I did. `Processes marked <defunct> are dead processes (so-called "zombies") that remain because their parent has not destroyed them properly.` That's exactly what I'm confused about. Since `kill(workProcessId, SIGTERM)` is not the correct way, what is the correct way? I don't want to leave a record behind, even if it's harmless. – Yongqi Z Jul 03 '23 at 07:23
  • There is no correct way to kill a zombie process, other than letting its parent process reap it with one of the `wait` functions. – Some programmer dude Jul 03 '23 at 08:04
  • So is your question now "how can I reap the zombie child process" or "why is the child process a zombie" or something else? – Useless Jul 03 '23 at 08:21
  • @Someprogrammerdude Sorry, but I don't know what you mean. The `printf("start up child process again\n")` branch is already after the `fork` and will run in a new child process, won't it? – Yongqi Z Jul 03 '23 at 08:55
  • Sorry, misread the code. – Some programmer dude Jul 03 '23 at 08:58
  • Child processes are zombies because the kernel is waiting for the parent process to execute a `wait()` system call to request their `exit()` value or the reason for which they were killed. You cannot `kill()` them because they're already dead. If you don't want to see them anymore, you need to `wait()` for them in the parent, or `kill()` or `exit()` the parent itself. Process accounting also depends on these zombie processes. – Luis Colorado Jul 06 '23 at 08:20

2 Answers

  1. At t+3s you call waitpid(..., WNOHANG), which immediately returns without reaping the child, as is evident from childRes == 0. You kill the first child, then overwrite workProcessId with the pid of the 2nd child. Rinse and repeat. This means waitpid() is never called after a child has terminated, and at t=T you end up with roughly T/3 zombie child processes. The easiest fix would be to change WNOHANG to 0 so the parent blocks waiting for the child. You would get a similar effect by just using wait() to block waiting for any child.

    Alternatively, maintain an array of pid_t to hold each of the children that haven't been reaped yet. Then loop over that array with waitpid(..., WNOHANG).

  2. You probably want to fix the logic in parent_process_exec() so it doesn't unconditionally fork a new child.

  3. On Linux, I had to include signal.h for kill().

  4. Change int workProcessId to pid_t workProcessId.

  5. The 2nd argument to open() is an int not a string so you want to use O_WRONLY not "O_WRONLY". Always check return values.

Allan Wind
  • `The easiest fix would be to change WNOHANG to 0 so parent blocks waiting for child. You would get similar effect by just using wait() to block waiting for any child.` In fact, the child process handles the business logic and reports a heartbeat every 3 seconds. If it does not report for a long time, it may be dead or blocked; if it is blocked, I kill it. I think `wait()` would block the parent process indefinitely if the child is blocked, so I cannot use `wait()`. – Yongqi Z Jul 04 '23 at 00:42
  • To keep the example brief, I left out the heartbeat-reporting code. – Yongqi Z Jul 04 '23 at 00:43
  • The additional information changes the question. That said, I anticipated it, and you want to refer to the alternative answer in that case (keep an array of pids around then use a loop in the parent to non-blocking waitpid for each of the zombie processes). If I answered your original question please accept it by clicking the check mark next to it. – Allan Wind Jul 04 '23 at 03:00

Following the comment from @Useless, I added wait() after killing the child process, and now the parent reaps the child. Like this:

if(childRes == 0) {  
    printf("kill process\n"); 
    printf("kill get %d\n", kill(workProcessId, SIGTERM));
    wait(NULL); // blocks until a child terminates, reaps it, and returns its pid
}

I know a zombie process is just a pid and is harmless, but I thought there should be a way to kill a child process cleanly. The fact that a zombie was left after the parent killed its child really confused me.
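One caveat with the snippet above: wait(NULL) reaps whichever child terminates first and blocks for as long as it takes, which could hang the parent if the child ignores or blocks SIGTERM. A possible refinement, sketched here under assumptions (the helper name `terminate_and_reap()`, the `grace_ms` parameter and the 10 ms polling step are invented for illustration), is to wait for that specific pid for a bounded time and fall back to SIGKILL:

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/* Ask pid to terminate, poll for up to grace_ms milliseconds, then force
 * it with SIGKILL. Returns 0 once the child has been reaped, -1 on error. */
int terminate_and_reap(pid_t pid, int grace_ms)
{
    struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms polling step */
    int waited_ms = 0;

    if (kill(pid, SIGTERM) == -1)
        return -1;

    while (waited_ms < grace_ms) {
        pid_t r = waitpid(pid, NULL, WNOHANG);
        if (r == pid)
            return 0;                    /* child exited and was reaped */
        if (r == -1 && errno != EINTR)
            return -1;                   /* e.g. ECHILD: not our child */
        nanosleep(&tick, NULL);
        waited_ms += 10;
    }

    /* Grace period expired: SIGKILL cannot be caught or ignored. */
    if (kill(pid, SIGKILL) == -1)
        return -1;
    for (;;) {
        pid_t r = waitpid(pid, NULL, 0); /* blocking, but returns promptly */
        if (r == pid)
            return 0;
        if (r == -1 && errno != EINTR)
            return -1;
    }
}
```

With this, the body of the if (childRes == 0) branch could become a single call such as terminate_and_reap(workProcessId, 500), where the 500 ms grace period is an arbitrary choice.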

Yongqi Z