
The waiting works fine with pidfd_open and poll.

The problem I’m facing: after the process quits, poll() apparently removes the information about the now-dead process, so waitid with the P_PIDFD argument fails immediately with error 22, “Invalid argument”.

I don’t think I can afford to launch a thread for every child process just to sleep in a blocking waitpid. I have multiple processes, plus other handles that aren’t processes, and I need to poll them all efficiently.

Any workarounds?

If it matters, I only need to support Linux 5.13.12 and newer running on ARM64 and ARMv7 CPUs.

The approximate sequence of kernel calls is as follows:

  1. fork
  2. In the child: setresuid, setresgid, execvpe
  3. In the new child: printf, sleep, _exit
  4. Meanwhile in the parent: pidfd_open, poll, and once poll completes, waitid with P_PIDFD as the first argument.

Expected result: waitid should give me the exit code of the child.

Actual result: it does nothing and sets errno to EINVAL

Soonts
  • 2
    What works? What doesn't work? Please post a [mre]. BTW you can pass -1 to waitpid so you only need one. – n. m. could be an AI Sep 23 '21 at 15:52
  • Can you create a thread-safe variable (i.e. a boolean flag, e.g. `bool sf_running = true;`), set it to false when the time is right, and test it in your monitoring thread? – ryyker Sep 23 '21 at 15:53
  • @n.1.8e9-where's-my-sharem. `poll` works, I’m notified when child quits. `waitid` fails so I’m unable to get the exit code. – Soonts Sep 23 '21 at 15:55
  • @ryyker I only have a single thread in each of the involved processes. – Soonts Sep 23 '21 at 15:57
  • waitpid can only fail if you are using it incorrectly. – n. m. could be an AI Sep 23 '21 at 15:57
  • We are all throwing darts in the dark without being able to see the relevant sections of code you are referring to. re-direct to comment ( `1` ) – ryyker Sep 23 '21 at 15:59
  • @ryyker It’s part of a much larger system which is not even in C. People normally call these kernel APIs from C, which is why I added the tag. If I don’t get any answers and can’t solve it myself, I will certainly make a minimal repro in C. – Soonts Sep 23 '21 at 16:02
  • @ryyker Updated the question. Hopefully it’s now clear what I am doing. – Soonts Sep 23 '21 at 16:07
  • Could you post a real small program that reproduces the problem? `approximate sequence of kernel calls is following:` Surely it would not take that much time to write such a program. – KamilCuk Sep 23 '21 at 16:07
  • I have written the program with the calls you presented, it took me 10 mins, maybe 20. I cannot reproduce - `poll` works, `waitid` returns with success and the child is terminated and child exit status is in `infop->si_status`. _Please post the source code_ [MCVE]. Most probably your call to `waitid` is invalid, and you just passed `0` as options argument, is that right? – KamilCuk Sep 23 '21 at 16:40
  • @KamilCuk Thanks for your help. I was not passing 0, but the only bit I was passing was `WNOHANG`. If you copy-paste the comment to an answer, I’ll happily accept. – Soonts Sep 23 '21 at 16:48

2 Answers

2

There is one crucial bit. From man waitid:

Applications shall specify at least one of the flags WEXITED, WSTOPPED, or WCONTINUED to be OR'ed in with the options argument.

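You wrote that the only bit you were passing was WNOHANG. Without one of WEXITED, WSTOPPED, or WCONTINUED, the call is rejected with EINVAL.

So you want to pass WNOHANG | WEXITED ;)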
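As a rough illustration of the difference, here is a minimal sketch of the pidfd_open, poll, waitid sequence from the question. The helper name `run_and_reap` is hypothetical, and the fallback definitions for `SYS_pidfd_open` and `P_PIDFD` are assumptions for toolchains whose headers do not provide them yet. It is not the original poster's code, which was never shown.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434  /* assumption: generic syscall number (ARM, ARM64, x86-64) */
#endif
#ifndef P_PIDFD
#define P_PIDFD 3           /* assumption: value used by the Linux UAPI headers */
#endif

/* Hypothetical helper: fork a child that exits with `code`, wait for it
 * with pidfd_open + poll, then reap it with the given waitid options.
 * Returns the child's si_status on success, or -errno on failure. */
int run_and_reap(int code, int options)
{
    pid_t pid = fork();
    if (pid < 0)
        return -errno;
    if (pid == 0)
        _exit(code);                    /* child: exit right away */

    int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
    if (pidfd < 0) {
        int err = errno;
        waitpid(pid, NULL, 0);          /* do not leave a zombie behind */
        return -err;
    }

    /* The pidfd becomes readable once the child has terminated. */
    struct pollfd pfd = { .fd = pidfd, .events = POLLIN };
    int rc;
    do {
        rc = poll(&pfd, 1, -1);
    } while (rc < 0 && errno == EINTR);

    siginfo_t info = {0};
    int result;
    if (waitid(P_PIDFD, (id_t)pidfd, &info, options) == 0) {
        /* With WNOHANG, si_pid stays 0 if nothing was ready to be reaped. */
        result = info.si_pid != 0 ? info.si_status : -EAGAIN;
    } else {
        result = -errno;                /* EINVAL when WEXITED is missing */
        waitpid(pid, NULL, 0);          /* reap the zombie the failed call left */
    }
    close(pidfd);
    return result;
}
```

Under these assumptions, `run_and_reap(7, WNOHANG)` should come back as `-EINVAL`, while `run_and_reap(7, WNOHANG | WEXITED)` should return 7.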

ti7
KamilCuk
  • Thanks, it works. Mostly. Do you know a way to stop the kernel from destroying 24 out of the 32 bits in my exit codes? – Soonts Sep 23 '21 at 19:17
  • It’s not the libc it’s the OS kernel, I’m using this libc: https://git.musl-libc.org/cgit/musl/tree/src/process/waitid.c – Soonts Sep 23 '21 at 19:21
  • `24 out of the 32 bits in my exit codes?` Exit code has 8 bits, I do not understand. – KamilCuk Sep 23 '21 at 20:03
  • The `_exit` API function takes 32 bit integer as an argument. The `si_status` field deep inside the `siginfo_t` structure also contains 32 bits of data. Yet somewhere between these two, the Linux kernel is destroying my integers, truncating them to 8 bits. – Soonts Sep 23 '21 at 22:29
  • I can live without proper exit codes, emulating them somehow. I already have Unix domain sockets anyway. But still, if there’s an easy way to fix the kernel, I’d rather do that. – Soonts Sep 23 '21 at 22:31
  • `takes 32 bit intege` `int` is a type with _at least_ 16 bits, not exactly 32 bits. And those 32 bits are truncated to 8 bits. The exit status has 8 bits. – KamilCuk Sep 23 '21 at 22:32
  • But anyway, you are right - looks like from https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html si_status should not be limited to 8 bits. – KamilCuk Sep 23 '21 at 22:35
  • 1
    https://stackoverflow.com/questions/50982730/how-to-get-the-full-returned-value-of-a-child-process Looks like linux is not POSIX compatible here and makes it 8 bits anyway – KamilCuk Sep 23 '21 at 22:41
0

You can use a single reaper thread, looping on waitpid(-1, &status, 0). Whenever it reaps a child process, it looks it up in the set of current child processes, handles possible notifications (semaphore or callback), and stores the exit status.

There is one notable situation that needs special consideration: the child process may exit before fork() returns in the parent process. This means it is possible for the reaper to see a child process exiting before the code that did the fork() manages to register the child process ID in any data structure. Thus, both the reaper and the fork() registering functions must be ready to look up or create the record in the data store keeping track of child processes; including calling the callback or posting the semaphore. It is not complicated at all, but unless you are used to thinking in asynchronous terms, it is easy to miss these corner cases.
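A minimal sketch of that "look up or create" rule follows, assuming a small fixed-size table and a callback-style notification. Every name in it (`child_register`, `child_reaped`, `remember_exit`, `MAX_CHILDREN`) is hypothetical, and the example callback exists only so the behaviour can be observed. Whichever side arrives second finds the record left by the first and fires the callback.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_CHILDREN 64                 /* arbitrary capacity for the sketch */

typedef void (*exit_cb)(pid_t pid, int status);

struct child_rec {
    pid_t   pid;                        /* 0 marks a free slot */
    bool    exited;                     /* set if the reaper got here first */
    int     status;
    exit_cb cb;                         /* set if the fork side got here first */
};

static struct child_rec child_table[MAX_CHILDREN];
static pthread_mutex_t child_table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Find the record for pid or claim a free slot. Caller holds the lock. */
static struct child_rec *find_or_create(pid_t pid)
{
    struct child_rec *free_slot = NULL;
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        if (child_table[i].pid == pid)
            return &child_table[i];
        if (child_table[i].pid == 0 && free_slot == NULL)
            free_slot = &child_table[i];
    }
    if (free_slot != NULL)
        *free_slot = (struct child_rec){ .pid = pid };
    return free_slot;                   /* NULL when the table is full */
}

/* Called by the code that did the fork(), after fork() returned. */
int child_register(pid_t pid, exit_cb cb)
{
    if (pid <= 0 || cb == NULL)
        return -1;
    pthread_mutex_lock(&child_table_lock);
    struct child_rec *rec = find_or_create(pid);
    if (rec == NULL) {
        pthread_mutex_unlock(&child_table_lock);
        return -1;
    }
    bool fire = rec->exited;            /* the reaper saw the exit already */
    int status = rec->status;
    if (fire)
        rec->pid = 0;                   /* release the slot */
    else
        rec->cb = cb;
    pthread_mutex_unlock(&child_table_lock);
    if (fire)
        cb(pid, status);                /* call outside the lock */
    return 0;
}

/* Called by the reaper for every child it reaps. */
int child_reaped(pid_t pid, int status)
{
    if (pid <= 0)
        return -1;
    pthread_mutex_lock(&child_table_lock);
    struct child_rec *rec = find_or_create(pid);
    if (rec == NULL) {
        pthread_mutex_unlock(&child_table_lock);
        return -1;
    }
    exit_cb cb = rec->cb;
    if (cb != NULL) {
        rec->pid = 0;                   /* already registered: release the slot */
    } else {
        rec->exited = true;             /* not registered yet: park the status */
        rec->status = status;
    }
    pthread_mutex_unlock(&child_table_lock);
    if (cb != NULL)
        cb(pid, status);
    return 0;
}

/* Example callback for illustration only: remembers the last notification. */
pid_t last_pid;
int last_status;
int notifications;
void remember_exit(pid_t pid, int status)
{
    last_pid = pid;
    last_status = status;
    notifications++;
}
```

Both call orders end in exactly one callback invocation per child. The sketch ignores PID reuse and uses a linear scan, so a real data store would likely want a hash table and a policy for records that are never claimed.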

Because wait(...)/waitpid(-1,...) returns immediately when there are no child processes to wait for (with -1 and errno set to ECHILD), the reaper thread should probably wait on a condition variable whenever no child processes exist. The code that registers the child process ID then signals that condition variable, which minimizes resource use in the no-child-processes case. (Also, do remember to reduce the reaper thread's stack size: the default is unreasonably large (on the order of 8 MiB) and wastes resources. I often use 2*PTHREAD_STACK_MIN, myself.)