I'm trying to improve the startup scripts for several servers running in a cluster environment. The server processes should run indefinitely but occasionally fail on startup with, e.g., "Address already in use" exceptions.

I'd like the exit code for the startup script to reflect these early terminations by, say, waiting for 1 second and telling me if the server seems to have started okay. I also need the server PID echoed.

Here's my best shot so far:

$ cat startup.sh
# start the server in the bg but if it fails in the first second, 
# then kill startup.sh.

CMD="start_server -option1 foo -option2 bar"
eval "($CMD >> cc.log 2>&1 || kill -9 $$ &)"
SERVER_PID=$!

# the `kill` above only has 1 second to kill me-- otherwise my exit code is 0
sleep 1
echo $SERVER_PID

The exit code works fine but two problems remain:

  1. If the server is long-running but eventually encounters an error, the parent startup.sh will have exited already and the $$ PID may have been reused by an unrelated process which this script will then kill off.

  2. The SERVER_PID isn't correct since it's the PID of the subshell rather than of the start_server command (which in this case is a grandchild of the startup.sh script).

Is there a simpler way to background the start_server process, get its PID, and use a timeout'ed check for error codes? I looked into the bash builtin wait and the timeout command but they don't seem to work for processes that shouldn't exit in the end.

I can't change the server code and the startup script should not run indefinitely.

Jake Biesinger

2 Answers

1

You can also use coproc (and look, I'm putting the command in an array, and also with proper quoting!):

#!/bin/bash
cmd=( start_server -option1 foo -option2 bar )
coproc mycoprocfd { "${cmd[@]}" >> cc.log 2>&1 ; }
server_pid=$!
sleep 1
if [[ -z ${mycoprocfd[*]} ]]; then
    echo >&2 "Failure detected when starting server! Server died before 1 second."
    exit 1
else
    echo $server_pid
fi

The trick is that coproc stores the file descriptors connected to the command's stdout and stdin in the named array (here mycoprocfd) and unsets the array when the process exits. So you don't need to do clumsy stuff with the PID itself.

You can therefore block until the server exits like so:

#!/bin/bash
cmd=( start_server -option1 foo -option2 bar )
coproc mycoprocfd { "${cmd[@]}" >> cc.log 2>&1 ; }
server_pid=$!
read -u "${mycoprocfd[0]}"
echo >&2 "Oh dear, the server with PID $server_pid died after $SECONDS seconds."
exit 1

That's because read will read on the file descriptor given by coproc (but nothing is ever read here, since the stdout of your command has been redirected to a file!), and read exits when the file descriptor is closed, i.e., when the command launched by coproc exits.

I'd say this is a really elegant solution!

Now, this script will live as long as the coproc lives, which I understand is not what you want. In that case, you can time out the read with its -t option and use the fact that read's exit status is greater than 128 if it timed out. E.g., for a 4.5-second timeout:

#!/bin/bash
timeout=4.5
cmd=( start_server -option1 foo -option2 bar )
coproc mycoprocfd { "${cmd[@]}" >> cc.log 2>&1 ; }
server_pid=$!
read -t "$timeout" -u "${mycoprocfd[0]}"
if (($?>128)); then
    echo "$server_pid <-- all is good, it's still alive after $timeout seconds."
else
    echo >&2 "Oh dear, the server with PID $server_pid died within $timeout seconds."
    exit 1
fi
exit 0 # Yay

This is also very elegant :).
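If you want to reuse the timed check, it could be wrapped in a function. This is only a sketch under a few assumptions: started_ok, server_fd, and the log-file argument are names made up here, not part of the scripts above, and it needs bash 4.0 or later for coproc and the fractional read -t.

```shell
#!/bin/bash
# Sketch only: started_ok and server_fd are hypothetical names.
# Requires bash >= 4.0 (coproc, fractional read -t).

# started_ok TIMEOUT LOGFILE CMD [ARGS...]
# Prints the coproc PID and returns 0 if CMD is still running after
# TIMEOUT seconds; returns 1 if it exited before that.
started_ok() {
    local timeout=$1 log=$2
    shift 2
    coproc server_fd { "$@" >> "$log" 2>&1 ; }
    local pid=$!
    local status=0
    # read blocks until the coproc's stdout is closed (the coproc exited)
    # or the timeout expires; on timeout its exit status is > 128.
    # stderr is silenced in case the coproc is already gone and the
    # descriptor is no longer valid (read then fails with a status < 128).
    read -r -t "$timeout" -u "${server_fd[0]}" 2>/dev/null || status=$?
    if (( status > 128 )); then
        echo "$pid"
        return 0
    fi
    return 1
}
```

It would be called as started_ok 4.5 cc.log start_server -option1 foo -option2 bar, and the caller can then branch on its exit status. The `|| status=$?` form is used so that the function also behaves under set -e.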

Use, extend, and adapt to your needs! (but with good practices!)

Hope this helps!

Remarks.

  • coproc is a bash keyword that appeared in bash 4.0. The solutions shown here are 100% pure bash (except the first one, with sleep, which is not the best one at all!).
  • The use of coproc in scripts is almost always superior to putting jobs in background with & and doing clumsy and awkward stuff with sleep and checking $!.
  • If you want coproc to keep quiet, whatever happens (e.g., if there's an error launching the command, which is fine here since you're handling everything yourself), do:

    coproc mycoprocfd { "${cmd[@]}" >> cc.log 2>&1 ; } > /dev/null 2>&1
    
gniourf_gniourf
  • Great answer. I hadn't seen `coproc` before and would have been lost in its usage if I had. One concern is the `bash` version-- this may be running on some very old clusters with bash 3.x. Is there a workaround in that space? – Jake Biesinger Oct 30 '13 at 17:41
  • about coproc and array initialization: will `cmd=( "path with spaces/start_server" -option1 foo -option2 bar )` be interpreted correctly by coproc? Also, why the spaces around the `(` and `)` in array initialization? Just aesthetics? – Jake Biesinger Oct 30 '13 at 18:05
  • @JakeBiesinger Regarding old versions of bash and coproc... I can't really help you, just try to make up something not too awkward with `$!`, `sleep`, `kill`, etc. as you already have. Regarding arrays: `cmd=( "path with spaces/start_server" -option1 foo -option2 bar )` will work (you have to use proper quoting!) and that's one of the great things about arrays! The spaces around the parentheses are just there for cosmetic reasons (arguably pretty). – gniourf_gniourf Oct 30 '13 at 18:23
  • slick... `coproc` indeed interprets that first array value as a single argument, even though it's being expanded inside `coproc { "${cmd[@]}" ; }`. You've made a convert out of me `:)` – Jake Biesinger Oct 30 '13 at 20:42
0

Twenty more minutes of googling turned up https://stackoverflow.com/a/6756971/494983 and kill -0 $PID from https://stackoverflow.com/a/14296353/494983.

So it seems I can use:

$ cat startup.sh   
CMD="start_server -option1 foo -option2 bar"
eval "$CMD >> cc.log 2>&1 &"
SERVER_PID=$!
sleep 1
if ! kill -0 "$SERVER_PID"; then
    echo "Failure detected when starting server! PID $SERVER_PID doesn't exist!" 1>&2
    exit 1
else
    echo "$SERVER_PID"
fi

This wouldn't work for processes that I can't send signals to but works well enough in my case (where startup.sh starts the server itself).
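For clusters stuck on bash 3.x, the same kill -0 check can be written without eval by passing the command as separate arguments. This is a rough sketch, not a drop-in replacement: launch_and_check and its grace/log parameters are names invented for illustration.

```shell
#!/bin/bash
# Sketch only: launch_and_check is a hypothetical name.
# Uses nothing newer than bash 3.x: "$@", $!, sleep, kill -0, wait.

# launch_and_check GRACE_SECONDS LOGFILE CMD [ARGS...]
# Prints the PID and returns 0 if CMD is still alive after the grace
# period; otherwise reports the failure on stderr and returns 1.
launch_and_check() {
    local grace=$1 log=$2
    shift 2
    # A simple command backgrounded directly (no eval, no subshell
    # wrapper), so $! is the PID of the command itself.
    "$@" >> "$log" 2>&1 &
    local pid=$!
    sleep "$grace"
    if kill -0 "$pid" 2>/dev/null; then
        echo "$pid"
        return 0
    fi
    # The child is gone; wait retrieves its exit status.
    local status=0
    wait "$pid" || status=$?
    echo "Failure detected when starting server! PID $pid exited with status $status." >&2
    return 1
}
```

It would be called as launch_and_check 1 cc.log start_server -option1 foo -option2 bar. Because the command is not wrapped in a subshell, the PID it prints belongs to the server process rather than to an intermediate shell.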

Jake Biesinger
  • Good practices: do not use uppercase variable names; do not put commands in variables, but put them in _arrays_. Oh, and `eval` is evil `:D`. – gniourf_gniourf Oct 30 '13 at 09:08
  • Great feedback and a great answer about `coproc`. I had no idea bash arrays were the preferred holders of arguments (guess it makes sense why they have space-separated initializers). Thanks! – Jake Biesinger Oct 30 '13 at 17:39
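To illustrate the point from the comments about arrays versus commands stored in strings, here is a small sketch. The count_args helper and the variable names are made up for the demonstration; the path is the one used in the comments, and nothing is actually executed.

```shell
#!/bin/bash
# Sketch only: count_args and the variable names are hypothetical.
count_args() { echo "$#"; }

# A command kept in a string is split on whitespace when expanded
# unquoted, so a path containing spaces falls apart.
cmd_string='path with spaces/start_server -option1 foo'
string_count=$(count_args $cmd_string)

# An array keeps each element as exactly one argument.
cmd_array=( "path with spaces/start_server" -option1 foo )
array_count=$(count_args "${cmd_array[@]}")

echo "string: $string_count args, array: $array_count args"
```

With the default IFS, the string version is seen as five arguments while the array version stays at three, which is why "${cmd[@]}" is the safer way to hold a command line.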