I am currently trying to figure out why a shell script fails at concurrent logging every once in a while.
I have a shell function like the following:
log()
{
local l_text=$1
local l_file="/path/to/logs/$(date +%Y%m%d)_script.log"
local l_line="$(date +'%Y-%m-%d %H:%M:%S') $(hostname -s) ${l_text}"
echo ${l_line} >> ${l_file}
}
Now every once in a while this fails with a syntax error:
/path/to/script.sh: command substitution: line 163: syntax error near unexpected token `)'
/path/to/script.sh: command substitution: line 163: `hostname -s) ${l_text}'
The problem is, that I have multiple sub-processes, which each want to log as well as send traps (during which logging is performed as well). I haved debugged the problem and found out, that this happens, when the function is entered three times simultaneously. First the main
process enters, then the child
. After the date
part of l_text
is executed, main
get's interrupted by a trap
which is caused by child
and in this trap
tries to log something. The child
and the trap
finish their logging nicely, but then main
is resumed after the trap and tries to execute the hostname
part (presumedly) and fails with this error.
So it seems like main
does not like being put to sleep while it is producing the $(date +'%Y-%m-%d %H:%M:%S') $(hostname -s) ${l_text}
part of the log statement and cannot resume nicely. I was assuming this should work fine, because I am just using local variables and thread safe output methods.
Is this a general concurrency problem I am running into here? Or is this very specific for the trap mechanism in bash scripts? I know about the commodities of SIGNAL handling in C, so I am aware that only certain operations are allowed in SIGNAL handlers. However I am not aware if the same precautions also apply when handling SIGNALs in a bash script. I tried to find documentation on this, but none of the documents I could find gave any indications of problems with SIGNAL handling in scripts.
EDIT:
Here is an actuall simple script that can be used to replicate the problem:
#!/bin/bash
log() {
local text="$(date +'%Y-%m-%d %H:%M:%S') $(hostname -s) $1"
echo $text >> /dev/null
}
sub_process() {
while true; do
log "Thread is running"
kill -ALRM $$
sleep 1
done
}
trap "log 'received ALRM'" ALRM
sub_process &
sub_process_pid=$!
trap "kill ${sub_process_pid}; exit 0" INT TERM
while true; do
log "Main is running"
sleep 1
done
Every once in a while this script will get killed because of a syntax error in line 5. Line 5 is echo $text >> /dev/null
, but since the syntax error also mentiones the hostname command, similar to the one I posted above, I am assuming there is an of-by-one error as well and the actual error is in line 4, which is local text="$(date +'%Y-%m-%d %H:%M:%S') $(hostname -s) $1"
.
Does anybody know what to do with the above script to correct it? I alread tried moving out the construction of the string into some temporary variables:
log() {
local thedate=$(date +'%Y-%m-%d %H:%M:%S')
local thehostname=$(hostname -s)
local text="${thedate} ${thehostname} $1"
echo $text >> /dev/null
}
This way the error appears less frequently, but it still is present, so this is not a real fix.