
I am trying to implement a form of self-monitoring in C++14 for a process on embedded Linux. When process A starts and finds that it is the only instance running so far, it starts an additional process B from the same image, using posix_spawn and setsid. After that, A starts its actual work. Process B detects that two processes (A and B) are running from the same image and waits until A crashes. When A crashes, B stops waiting, starts a new process C to monitor B itself, takes over the work of the crashed A, and so on.

This works well if I start process A directly in a shell or from a shell script. When the initial start-up is done from a cron job, it works at first, but a crash of process A also kills the waiting child B. Is there any way to prevent this? So far I have tried various combinations of adding & or nohup to the cron job script, but nothing helped. I need B to survive the crash of A when B was started by A and A was started by a cron job. Doing this without cron is not an option, and using two separately started processes won't work in our scenario either.
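For reference, the spawn-and-detach step described above can be sketched roughly as follows. This is a minimal sketch, not the actual code from my project: the helper names `spawn_process` and `become_session_leader` are made up for illustration, and it assumes the watcher calls `setsid()` itself right after start-up rather than relying on a spawn attribute.

```cpp
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>

extern char **environ;  // environment of the calling process

// Hypothetical helper: start `path` with `args` via posix_spawn, passing the
// caller's environment through unchanged. Returns the child's PID, or -1 with
// errno set on failure.
pid_t spawn_process(const std::string& path,
                    const std::vector<std::string>& args) {
    assert(!path.empty());

    // argv[0] is the program path, followed by the arguments and a nullptr.
    std::vector<char*> argv;
    argv.push_back(const_cast<char*>(path.c_str()));
    for (const auto& a : args) {
        argv.push_back(const_cast<char*>(a.c_str()));
    }
    argv.push_back(nullptr);

    pid_t pid = -1;
    // posix_spawn returns an error number directly instead of setting errno.
    int rc = posix_spawn(&pid, path.c_str(), nullptr, nullptr,
                         argv.data(), environ);
    if (rc != 0) {
        errno = rc;
        return -1;
    }
    return pid;
}

// Hypothetical helper: meant to be called first thing in the watcher (B).
// setsid() puts the caller into a new session and process group; it fails
// with EPERM if the caller is already a process group leader.
bool become_session_leader() {
    return setsid() != -1;
}
```

In this sketch, A would call `spawn_process` with the path of its own image, and B would call `become_session_leader` before it starts waiting for A.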

Gernot
  • This is probably relevant; check setsid(): https://stackoverflow.com/a/46621604/12396017. If you find the answer to your issue, please write up a detailed answer to your own question. – Maxim Sagaydachny Dec 18 '19 at 14:56
  • If you don't mind using the parent process as the watcher, you could just use `fork()` and `waitpid()` to monitor the child doing the work; [here's a small example](https://onlinegdb.com/rkFrbCDRr) – Turtlefight Dec 18 '19 at 16:19
  • First of all, thanks for your comments. I already use setsid(), and letting the parent process watch did not work because a crashing child became defunct. However, I made some progress in analysing the root cause: it seems that when starting from a cron job, the environment variables get corrupted. The first process A has good ones and can start B, but B itself has corrupted ones and crashes while starting the next process C. I tried with hard-coded variables and that worked fine. Since I don't think this can be a final solution: any ideas how to prevent this corruption? – Gernot Dec 19 '19 at 16:12
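One way to take the inherited environment out of the picture while debugging is to build the environment block explicitly in every generation and hand it to `posix_spawn`, instead of forwarding whatever the process received. The following is only a sketch under that assumption: `to_c_array`, `spawn_with_env` and the variable `WATCHDOG_ROLE` are hypothetical names, and whether this addresses the actual root cause here is not confirmed.

```cpp
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>

// Build a nullptr-terminated char* array that points into `items`.
// The strings in `items` must stay alive and unmodified while the
// returned array is in use.
std::vector<char*> to_c_array(const std::vector<std::string>& items) {
    std::vector<char*> out;
    for (const auto& s : items) {
        out.push_back(const_cast<char*>(s.c_str()));
    }
    out.push_back(nullptr);
    return out;
}

// Hypothetical helper: spawn argv_strings[0] with exactly the environment in
// env_strings ("NAME=value" entries). Nothing is inherited implicitly.
// Returns the child's PID, or -1 with errno set on failure.
pid_t spawn_with_env(const std::vector<std::string>& argv_strings,
                     const std::vector<std::string>& env_strings) {
    assert(!argv_strings.empty());

    std::vector<char*> argv = to_c_array(argv_strings);
    std::vector<char*> envp = to_c_array(env_strings);

    pid_t pid = -1;
    // posix_spawn does not search PATH, so argv_strings[0] must be a path.
    int rc = posix_spawn(&pid, argv[0], nullptr, nullptr,
                         argv.data(), envp.data());
    if (rc != 0) {
        errno = rc;
        return -1;
    }
    return pid;
}
```

Keeping the environment in `std::string` objects that outlive the call means the pointers passed to `posix_spawn` stay valid for its whole duration, and the same list can be logged in A, B and C to compare what each generation actually passes on.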

0 Answers