
I have a bash script at work called the batch launcher. It is responsible for launching and reporting on the status of various batches. The launch of the batches is done like this:

env $BENV setsid $BATCH $OPT >> $FILE 2>&1

Several months ago I hit an issue where a batch with forked children sent a kill signal to its process group (PGID), which killed both the batch AND the batch launcher, a big no-no. My fix was to add the setsid part (since bash has no setpgrp), which creates a new session and, as a result, a new PGID.

Now I have another issue. It is the same pesky batch with forked children, except that the children do not write any logs until their job completes. After investigating, I found the reason: setsid spawns a process with no TTY, in which case output from the children is not flushed on every newline, but only once at the end, right before the stream is closed. This means that if a child dies with an error, the logs still in the buffer are gone forever, making debugging impossible.
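
To make the buffering rule concrete: a C or Perl runtime typically picks its buffering mode by checking whether stdout is a terminal, and the shell exposes the same check as `[ -t 1 ]`. The sketch below only illustrates that check; `stdout_kind` is a made-up helper name, not part of the batch launcher.

```shell
#!/bin/sh
# stdout_kind is a hypothetical helper used only for illustration.
# It reports whether file descriptor 1 (stdout) is attached to a terminal.
stdout_kind() {
    if [ -t 1 ]; then
        echo "tty"
    else
        echo "not a tty"
    fi
}
```

Called directly from an interactive shell it prints `tty`; called as `stdout_kind >> "$FILE"` or inside `$(...)` it prints `not a tty`, which is the situation the batch is in when its output goes to a log file.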

Is there a way to tell setsid to autoflush the STDOUT/STDERR output streams? Googling has yielded no results, and my only solution at this point is to rewrite the batch launcher in a language that supports setpgrp, like Perl or C, which will require significant testing from QA.

EDIT:

Minor note, the issue can be resolved with:

env $BENV script -qec "$BATCH $OPT" -af $FILE

This requires a util-linux release from 2010 or later, which supports the -e flag for return code processing (unfortunately not the case for me with RHEL5 from 2005, see commit for details).

IDDQD

1 Answer


Adding stdbuf -o0 (for no buffering, or stdbuf -oL for line buffering)

env $BENV setsid stdbuf -o0 $BATCH $OPT >> $FILE 2>&1

should fix the problem (as long as $BATCH buffers via libc's stdio and libc is linked in dynamically (default)).
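
Since `stdbuf` is missing from older coreutils, a launcher could guard the call. This is only a sketch: `run_line_buffered` is a hypothetical helper name, and it uses `-oL` (line buffering) rather than `-o0`. When `stdbuf` is not installed it runs the command unchanged, so buffering stays as it was.

```shell
#!/bin/sh
# run_line_buffered is a hypothetical wrapper, not part of the original launcher.
# It runs its arguments under `stdbuf -oL` when stdbuf is installed,
# and falls back to running them as-is otherwise.
run_line_buffered() {
    if command -v stdbuf >/dev/null 2>&1; then
        stdbuf -oL "$@"
    else
        "$@"
    fi
}
```

Because a shell function cannot be executed by `env` or `setsid`, it would have to wrap the whole command line, for example `run_line_buffered env $BENV setsid $BATCH $OPT >> $FILE 2>&1`. This assumes $BENV does not override LD_PRELOAD, which stdbuf relies on.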

Edit:

Here's a quick'n'dirty stdbuf -o0 emulation:

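#!/bin/sh -e
# Clean up the temporary build directory on exit or on common signals.
trap 'rm -rf "$tmpd"' EXIT HUP INT QUIT TERM
tmpd=$(mktemp -d)
# Tiny shared object whose constructor unbuffers stdout before main() runs.
cat > "$tmpd/unbuf.c" <<'EOF'
#include <stdio.h>
__attribute__((constructor))
static void unbuffer(void){ setvbuf(stdout, 0, _IONBF, 0); }
EOF
gcc "$tmpd/unbuf.c" -o "$tmpd/unbuf.so" -fpic -shared
# Preload it into the wrapped command given as arguments.
LD_PRELOAD="$tmpd/unbuf.so" "$@"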
Petr Skocik
  • Good point, but I'm currently stuck on RHEL5 with `coreutils 5.97-34.el5_8.1`. There's no `stdbuf` there. :( Is there any other utility, maybe something to set TTY? – IDDQD Jan 27 '17 at 18:59
  • @S.T.A.L.K.E.R. Made you a quick & dirty `stdbuf -o0` emulation. (Assumes you have gcc at least) – Petr Skocik Jan 27 '17 at 19:24
  • Thanks for the help! I've compiled the `unbuf.c` on the side, now I'm trying to launch the batch like this: `env LD_PRELOAD="$tmpd/unbuf.so" $BENV setsid $BATCH $OPT >> $FILE 2>&1`. But this does not help, I am still losing those logs. Do you see what I'm doing wrong? – IDDQD Jan 28 '17 at 00:49
  • Testing on this small slow writer (`int main() { uint i=0; for(;;){ printf("%u\n", i++); usleep(0.5*1000000); } }`, compiled to a.out) with unbuf.so in $PWD, `env LD_PRELOAD=$PWD/unbuf.so setsid ./a.out > out` works for me (it lets me `tail -f out` and watch its contents in real time). Unfortunately, if your $BATCH program doesn't use stdio or does something special with its output, you're out of luck unless it has its own option to disable buffering. – Petr Skocik Jan 28 '17 at 08:20
  • The batch is written in Perl. I've found that I can resolve the issue by calling `STDOUT->flush();` in Perl manually, but I guess there's no generic solution that would solve the problem for ALL the languages (C, C++, Python, Perl, etc.) – IDDQD Jan 30 '17 at 14:03
  • Also, using special variable `$|` (http://perldoc.perl.org/perlvar.html) is an option. – IDDQD Jan 30 '17 at 14:14