
I'm seeing some unexpected behavior with runit and not sure how to get it to do what I want without throwing an error during termination. I have a process that sometimes knows it should stop itself and not let itself be restarted (thus should call sv d on itself). This works if I never change the user but produces errors if I switch to a non-root user when running.

I'll use the same finish script for both examples:

#!/bin/bash -e
echo "downtest finished with exit code $1 and exit status $2"

The run script that works as expected (prints downtest finished with exit code 0 and exit status 0 to syslog):

#!/bin/bash -e
exec 2>&1
echo "running downtest"
sv d downtest
exit 0

The run script that doesn't work as expected (prints downtest finished with exit code -1 and exit status 15 to syslog):

#!/bin/bash -e
exec 2>&1
echo "running downtest"
chpst -u ubuntu sudo sv d downtest
exit 0

I get the same result if I use su ubuntu instead of chpst.

Any ideas on why I see this behavior and how to fix it so calling sudo sv d downtest results in a clean process exit rather than returning error status codes?

Gordon Seidoh Worley
  • By the way -- `set -e` is prone to... unexpected consequences. See the exercises in [BashFAQ #105](http://mywiki.wooledge.org/BashFAQ/105#Exercises), and the comparison of different shells' behavior at https://www.in-ulm.de/~mascheck/various/set-e/ – Charles Duffy Jul 27 '18 at 19:44

2 Answers


sv d sends SIGTERM if the process is still running. SIGTERM is signal 15, which is why the finish script reports exit code -1 and exit status 15.

By contrast, to tell a running program not to start up again after it exits, while still letting it exit on its own, use sv o (once) instead.
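A minimal sketch of how that could look for the question's downtest service. The stop_after_exit name and the SV variable are hypothetical, added only so the call can be dry-run on a machine without runit:

```bash
#!/bin/bash
# SV is a hypothetical override so the sketch can be dry-run (SV=echo);
# by default it invokes the real sv.
stop_after_exit() {
    "${SV:-sv}" o "$1"   # once: do not restart the service after ./run exits
}

# In ./run this would take the place of `sv d downtest`:
#   stop_after_exit downtest
#   exit 0
```

Since sv o does not signal the process, the run script gets to reach its own exit 0.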

Alternately, you can trap SIGTERM in your script when you're expecting it:

trap 'exit 0' TERM

If you want to make this conditional:

trap 'if [[ $ignore_sigterm ]]; then exit 0; fi' TERM

...and then run

ignore_sigterm=1

before triggering sv d.
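A small sketch to see the conditional trap in action without runit. Here kill -TERM "$$" stands in for the SIGTERM that sv d would trigger, and run_with_flag plus the exit code 3 are made up for the demonstration:

```bash
#!/bin/bash
# Runs a tiny bash script that sends itself SIGTERM, with the flag set to $1.
run_with_flag() {
    bash -s -- "$1" <<'EOF'
trap 'if [[ $ignore_sigterm ]]; then exit 0; fi' TERM
ignore_sigterm=$1
kill -TERM "$$"   # stand-in for `sv d downtest`
exit 3            # reached only when the trap did not exit
EOF
}

run_with_flag 1  && echo "flag set: exited 0"
run_with_flag "" || echo "flag unset: trap did nothing, script went on to exit $?"
```

With the flag set, the trap turns the SIGTERM into a clean exit 0; with it unset, the trap is a no-op and the script continues to its own exit.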

Charles Duffy
  • So this technically works but it ends up masking the exit status of the script. I'd still like to know if the run script exits with an error, I just would like it to not exit with an error when doing `sv d` – Gordon Seidoh Worley Jul 30 '18 at 18:11
  • I wouldn't expect this to mask exit status **except** when terminated by SIGTERM, as triggered by `sv d` or `sv t` or `kill -TERM`. Other cases should exit normally. – Charles Duffy Jul 30 '18 at 19:27
  • ...that said, you can easily set a flag to indicate when you want to mask the status; amending appropriately. – Charles Duffy Jul 30 '18 at 19:28
  • hmm, good point. i guess too i should expect in reality for the trap to only matter in this case or in cases where i already have a trap: normally runit scripts end with an `exec` which generally means if you tried to trap a signal in the calling script it wouldn't hit the trap anyway since the script is busy doing `exec` and then will exit straight from the exec. – Gordon Seidoh Worley Jul 30 '18 at 20:32
  • I wouldn't say the script is "busy doing `exec`"; once the `execve` syscall is successfully invoked, the shell is replaced in-memory with the thing it's invoking (which takes over the process-table entry), so any signals sent to that PID will go straight to the invoked program, not to the shell (which is no longer present at all). – Charles Duffy Jul 30 '18 at 20:56
  • The thing I still don’t get is why this only happens when using a non-root user, which makes me think there could be another issue – nbari Jul 31 '18 at 04:17
  • `chpst -u sudo` is a horrible practice, so it's not something I have much motive to think about; if making an effort to follow best practices, it's hard not to encounter "don't do that". (This is part of why I've avoided quoting that code in my answer, to prevent any implicit endorsement). That said, if you want to track down the chain of events, `sysdig` is your friend. – Charles Duffy Jul 31 '18 at 12:34

As a workaround, try running chpst -u ubuntu sudo sv d downtest in a subshell. That should let the script reach the final exit 0, which currently is never called because the script exits before getting there.

#!/bin/sh
exec 2>&1
echo "running downtest"
(sudo sv d downtest)
exit 0

In fact, you don’t need chpst -u ubuntu to stop the process. If you want to stop or control the service as another user, you just need to adjust the permissions on the ./supervise directory; that is probably why you are getting the exit code -1.

From the runsv man page:

Two arguments are given to ./finish. The first one is ./run’s exit code, or -1 if ./run didn’t exit normally. The second one is the least significant byte of the exit status as determined by waitpid(2); for instance it is 0 if ./run exited normally, and the signal number if ./run was terminated by a signal. If runsv cannot start ./run for some reason, the exit code is 111 and the status is 0.
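To make those two arguments easier to read in syslog, a finish script could decode them along these lines. This is only a sketch based on the man page text above; the describe_finish name and the message wording are illustrative:

```bash
#!/bin/bash
# Hypothetical helper for ./finish: turn runsv's two arguments into a message.
describe_finish() {
    local code=$1 status=$2
    if [ "$code" -eq -1 ]; then
        # ./run did not exit normally; if a signal ended it, the status
        # is the signal number (15 for SIGTERM).
        echo "downtest did not exit normally (wait status $status)"
    elif [ "$code" -eq 111 ] && [ "$status" -eq 0 ]; then
        # Ambiguous: either ./run exited 111 itself or runsv could not start it.
        echo "downtest exited 111 (runsv may have failed to start ./run)"
    else
        echo "downtest exited normally with code $code"
    fi
}
```

In ./finish this would be called as describe_finish "$1" "$2". For the failing case in the question it prints "downtest did not exit normally (wait status 15)".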

And from the FAQ:

Is it possible to allow a user other than root to control a service

Using the sv program to control a service, or query its status informations, only works as root. Is it possible to allow non-root users to control a service too?

Answer: Yes, you simply need to adjust file system permissions for the ./supervise/ subdirectory in the service directory. E.g.: to allow the user burdon to control the service dhcp, change to the dhcp service directory, and do

# chmod 755 ./supervise
# chown burdon ./supervise/ok ./supervise/control ./supervise/status

If you want to fully stop/start the service, you could remove your service's symlink, but then you will have to create it again when you want the service up.

Because of this and other cases, I came up with immortal to simplify the stop/start/restart/retries of services without root privileges. It is fully based on daemontools & runit, just adapted to some new flows.

nbari
  • Hmm, tried it but unfortunately still produces the same result. – Gordon Seidoh Worley Jul 30 '18 at 18:09
  • @GGordonWorleyIII what happens if you just `exit 0` putting it before `(chpst...` ? If that works try https://superuser.com/a/363454/284722 to capture the exit code – nbari Jul 30 '18 at 18:27
  • nope, this puts the service in an infinite loop where it constantly dies and then runit tries to restart it since the `sv d` line never gets executed because it comes after exit 0 which exits. i like the idea that it might be a race condition, which would also help explain why the trap works and why forking could have worked, but it didn't. – Gordon Seidoh Worley Jul 30 '18 at 19:00
  • @GGordonWorleyIII ok at least you can exit but the sub shell should be catching the exit and then allow you to exit with 0 in your case , what about just using `sudo sv d downtest`, you don’t really need `chpst` – nbari Jul 30 '18 at 19:11
  • i see the same behavior with `sudo`. – Gordon Seidoh Worley Jul 30 '18 at 20:28
  • Providing a better-practice alternative definitely makes this a much stronger answer -- if it *only* provided the new (`supervise/*` and/or `finish`) approaches, it would absolutely have my +1. – Charles Duffy Jul 31 '18 at 12:36
  • @CharlesDuffy you mean to have something like `finish` in `immortal`? – nbari Jul 31 '18 at 12:41