0

I have a long-running bash script (like a daemon), which may emit output to both stdout and stderr.

I have these requirements:

  • prepend every output line with a string (e.g. date, time of the log entry and other info)
  • redirect all output into a log file
  • output redirection must be real-time (i.e. log file must be filled while process is running)
  • the process needs to run in background

This is what I've achieved so far:

myscript="script_name"
log_file="/tmp/blabla"

prepend_info() {
    xargs -d '\n' printf "$(date +'%b %d %T') $HOSTNAME $myscript: %s\n"
}

script -q -c "$myscript" /dev/null 2>&1 | prepend_info > "$log_file" 2>&1 &

The problem is that the log file gets filled only after my script has terminated, but I want to see output while it's running instead.

If I remove |& prepend_info it works as expected, but I need that additional info into the log file as well.

It seems like the pipe only gets executed after the first command terminates.

Is there some way to modify the output of a background script and redirect it into a file while it's running?

I need to be as compatible as possible, and I can only use simple bash commands. For example, I cannot use ts because it's not always available and also I don't need only the timestamp but other info as well.

UPDATE: The only solution I found so far (it solves everything, also the issue with date) is the following.

myscript="script_name"
log_file="/tmp/blabla"

exec_script() {
    rm -f "$log_file"
    local out_log=<($myscript 2>&1)
    while read -r line; do
        echo "$(date +'%b %d %T') $HOSTNAME $myscript: $line" >> $log_file
    done < "$out_log"
}

exec_script &

If anyone has a better solution, I'm all ears.

gentooise
  • 416
  • 3
  • 8
  • `$myscript=` is invalid. – KamilCuk Nov 18 '22 at 13:19
  • Just a typo while trying to reduce the snippet to the bare minimum, fixed it. – gentooise Nov 18 '22 at 13:48
  • 1
    Have you tried researching? Like https://unix.stackexchange.com/questions/26728/prepending-a-timestamp-to-each-line-of-output-from-a-command https://unix.stackexchange.com/questions/592215/best-way-to-prefix-output-of-a-command-and-check-its-return-value . Check your script with shellcheck , and you might want to read https://wiki.bash-hackers.org/scripting/obsolete . `It seems like the pipe only gets executed` https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe https://stackoverflow.com/questions/51452321/how-does-buffer-in-bash-pipe-work-on-linux – KamilCuk Nov 18 '22 at 14:00
  • @KamilCuk his way of prepending a **string constant** to each line of the output isn't so horrible; I would use `| xargs -0 printf "$(date +'%b %d %T') $(hostname) $myscript: %s\n"` though – Fravadona Nov 18 '22 at 14:10
  • Yes, I did research @KamilCuk, but I would like to understand where my snippet is wrong (apart from cleaning up obsolete syntax, ok). If piped commands are run in parallel (as your last link says), why am I not seeing the output inside log file in real-time? Also tried `stdbuf -oL -eL $myscript 2>&1 | prepend_info 2>&1 > $log_file &` but no luck: same behavior, log file gets filled only after my script terminates. – gentooise Nov 18 '22 at 14:14
  • Because https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe `Connection a pipe to stdout also activates a 4K buffer` `ut no luck:` So what does that `tr .. |` do there at all, and _tr is also in a pipeline_. – KamilCuk Nov 18 '22 at 14:18
  • Your last comment is not clear to me. I understand that a 4K buffer is activated, but this information is not useful unless there is something that works. I tried what they are suggesting there, namely: `stdbuf -oL -eL $myscript 2>&1 | prepend_info 2>&1 > $log_file &` or `script -q -c "$myscript" /dev/null 2>&1 | prepend_info 2>&1 > $log_file &` but they don't work, log file still gets filled after script terminates. I need a solution which is widely available on all Unix-like platforms, like the one using `script` (if it just worked...) – gentooise Nov 18 '22 at 14:29
  • Ok, I understand what you mean, I also need to apply the same to the `tr | xargs` pipeline, but I would prefer to avoid that pipeline at this point. Any idea of how to replace `tr | xargs` with a single command? That pipeline comes from here https://stackoverflow.com/questions/21098382/how-to-add-timestamp-while-redirecting-stdout-to-file-in-bash#comment70835796_21107582 – gentooise Nov 18 '22 at 14:39
  • What version of bash do you target? – Fravadona Nov 18 '22 at 15:32
  • Unfortunately I don't have a precise requirement, I just need to be as compatible as possible. The one I'm currently trying is `4.1.2(2)-release`, and I know for sure I must support this one at least. Let's say "`4.1.2` and above" is a good approximation. – gentooise Nov 18 '22 at 15:38

2 Answers2

1

I would like to understand where my snippet is wrong

Remember to check your scripts with shellcheck. $myscript and $log_file are not quoted. Also see https://wiki.bash-hackers.org/scripting/obsolete - you are using function NAME(), just use NAME(), and both |& and >& shouldn't be used.

Commands in pipeline are fully buffered, which in your case flushes the output on the command termination, as the buffer is big enough. Both script_name and tr are fully buffered, which gives a big buffer before your command reaches xargs at all.

tr '\n' '\0' | xargs -0 just sounds very strange. Why change newline for zero to use zero? Just use newline xargs -d '\n' from the start.

Additionally command substitution expands before running xargs, so $(date +'%b %d %T') will expand once to the date before xargs has started, not the time the current line has been written.

In Bash, $HOSTNAME is faster than $(hostname).

Moreover, tr output is always fully buffered, as tr program itself uses an algorithm with full buffering. Sooo.. do not use tr.

Is there some way to modify the output of a background script and redirect it into a file while it's running?

There are many, many, many solutions available on the net on how to prefix command output, like https://unix.stackexchange.com/questions/26728/prepending-a-timestamp-to-each-line-of-output-from-a-command , https://unix.stackexchange.com/questions/592215/best-way-to-prefix-output-of-a-command-and-check-its-return-value .

Subjectively, the best is not to use shell, and write a python script that will subprocess.popen the command and implement what you want by reading from file descriptors. That way the performance will be acceptable.

KamilCuk
  • 120,984
  • 8
  • 59
  • 111
  • I'm avoiding `xargs -d` because of this answer: https://stackoverflow.com/a/28806991/2043589, I need to be as compatible as possible, and I can only use simple bash commands. For example, I cannot use `ts` because it's not always available and also I don't need only the timestamp but other info. – gentooise Nov 18 '22 at 14:44
  • How can I overcome the issue with the date command being substituted only once? – gentooise Nov 18 '22 at 14:51
  • I updated the snippet in my question with all your suggestions, but it still doesn't work. – gentooise Nov 18 '22 at 15:01
  • I do not understand. How https://unix.stackexchange.com/a/26729/209955 doesn't answer your question? `How can I overcome the issue with the date command being substituted only once?` Do not use xargs, write a normal loop. – KamilCuk Nov 18 '22 at 18:02
  • The solution with the while works fine, but I cannot understand why the first snippet in my question still does not (the one with `script -c` and `xargs -d`). I still see the log only after the command terminates, but the only pipe left should not be buffered anymore... Neither `script` nor `stdbuf` seem to have any effect. – gentooise Nov 18 '22 at 22:46
  • Because xargs joins all arguments, you want `xargs -n 1` – KamilCuk Nov 19 '22 at 08:25
  • Ok, now I understand, it works. However, I still need to invoke a new shell with `xargs` every time in order to do `date` command substitution once per line... so I would prefer the solution with the explicit while. Thanks – gentooise Nov 19 '22 at 14:51
1

With bash 4.2+ you can avoid forking a date command per line of input with printf '%( ... )T' and the global variable SECONDS

#!/bin/bash

myscript="script_name"
log_file="/tmp/blabla"

SECONDS=$(command -p awk 'BEGIN{srand();print srand()}')

{
    "$my_script" 2>&1 |
    while IFS='' read -r line
    do
        printf '%(%b %d %T)T %s %s: %s\n' "$SECONDS" "$HOSTNAME" "$myscript" "$line"
    done > "$log_file" 2>&1
} &
Fravadona
  • 13,917
  • 1
  • 23
  • 35