
I've written a bash script that shows anomalous memory usage when looping over a large number of files. After some time, all memory is exhausted and the system starts swapping, making it unusable. After placing some sentinels around the code, I believe the function causing the issue is _do_cmd(), included in the following simplified script.

#!/bin/bash

WORKINGDIR=$(dirname "$0")
SCRIPT=$(basename $0)
LOGFILE=$WORKINGDIR/test-log.txt
FILELIST=$WORKINGDIR/file.list
INDIR=/media/data/incoming
OUTDIR=$WORKINGDIR/Foto/copied

_log() {
  echo -e "[$(date +"%Y-%m-%d %H:%M:%S")]: $*" >> $LOGFILE
}

_do_cmd() {
  local LOGFILE_TMP="$LOGFILE.tmp"
  exec 2> >(tee -a $LOGFILE_TMP)
  _log " ACTION: $@"
  "$@"
  ret=$?
  if [[ $ret -ne 0 ]]; then
      _log "ERROR: Return code $ret"
      grep -v "frame= " "$LOGFILE_TMP" >> $LOGFILE
      rm -f $LOGFILE_TMP
      exit $ret
  fi
  if [ -f $LOGFILE_TMP ]; then rm $LOGFILE_TMP; fi
}

while read F
do
  echo "Before: $(ps -ef | grep $SCRIPT | wc -l)"
  FILE=$(basename $F)
  _do_cmd cp "$INDIR/$FILE" "$OUTDIR"
  echo "After: $(ps -ef | grep $SCRIPT | wc -l)"
done < $FILELIST

When I run the script, I see output like the following:

$ ./test-mem.sh
Before: 3
After: 4
Before: 4
After: 5
Before: 5
After: 6
Before: 6
After: 7
Before: 7
After: 8
Before: 8
After: 9
Before: 9
After: 10
Before: 10
After: 11
Before: 11
After: 12
Before: 12
After: 13
Before: 13
After: 14
Before: 14
After: 15
Before: 15
After: 16
Before: 16
After: 17
Before: 17
After: 18
Before: 18
After: 19
Before: 19
After: 20
Before: 20
After: 21
^C

Looking at the running processes during execution, I find that the number of instances of my script grows steadily:

$ watch -n 1 "ps -ef | grep test-mem.sh"
Every 1,0s: ps -ef | grep test-mem.sh                                                          Wed Apr  4 10:23:32 2018

user     4117  4104  0 10:23 pts/1    00:00:00 watch -n 1 ps -ef | grep test-mem.sh
user     4877  1309 11 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     4885  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     4899  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     4913  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     4927  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     4941  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     4955  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     4969  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     4983  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     4997  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5011  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5025  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5043  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5057  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5071  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5085  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5099  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5113  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5127  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5141  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5155  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5169  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5183  4877  0 10:23 pts/0    00:00:00 /bin/bash ./test-mem.sh
user     5304  4117  0 10:23 pts/1    00:00:00 watch -n 1 ps -ef | grep test-mem.sh
user     5305  5304  0 10:23 pts/1    00:00:00 sh -c ps -ef | grep test-mem.sh
user     5307  5305  0 10:23 pts/1    00:00:00 grep test-mem.sh

The purpose of _do_cmd() is to run a command, capture its error output and, only in case of an error, store that output in the log file and exit to the shell.

Can anybody help me understand why, after every _do_cmd() execution, a new instance of test-mem.sh is left running on the system?

Thanks in advance.

liv913
    Which is it, a memory leak or zombie processes? How are you calling this function? – melpomene Apr 03 '18 at 13:30
  • just some ideas here, change `exit` to `return`? AND you **want** to leave your $TMP file in place if there is an error? Good luck. – shellter Apr 03 '18 at 13:31
  • Please read [mcve]. The code you posted here is incomplete; it's likely not minimal (is every single line in there required to reproduce the problem?); and there's no clear problem description. – melpomene Apr 03 '18 at 13:32
  • Just a guess, but parentheses '()' open a subshell, and tee reads input continuously. So try without them. – NaN Apr 04 '18 at 07:08
  • You're right, sorry for my poor explanation. I revised my question; I hope it makes more sense now. – liv913 Apr 04 '18 at 08:30
  • @NaN I used process substitution as explained in https://stackoverflow.com/questions/692000/how-do-i-write-stderr-to-a-file-while-using-tee-with-a-pipe. Maybe this is causing the issue, but how can I make it work without the parentheses? – liv913 Apr 04 '18 at 09:55

1 Answer


Ok, I've found a solution. With this revised function, the "Before" and "After" values remain stable and memory usage stops growing.

_do_cmd() {
  local LOGFILE_TMP="$LOGFILE.tmp"
  _log " ACTION: $@"
  "$@" 2> >(tee -a $LOGFILE_TMP)
  ret=$?
  if [[ $ret -ne 0 ]]; then
      _log "ERROR: Return code $ret"
      grep -v "frame= " "$LOGFILE_TMP" >> $LOGFILE
      rm -f $LOGFILE_TMP
      exit $ret
  fi
  if [ -f $LOGFILE_TMP ]; then rm $LOGFILE_TMP; fi
}

The issue was apparently due to my use of exec, which left each process substitution running in the background.
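Since the captured output only matters when the command fails, the process substitution can also be dropped entirely (my own variant, not strictly needed once the exec is gone): redirect stderr straight to the temp file and replay it with cat afterwards, so nothing runs in the background at all. The trade-off is that errors appear on the terminal after the command finishes rather than live. LOGFILE and _log are stubbed in here just to keep the snippet self-contained.

```shell
#!/bin/bash
# Self-contained sketch; LOGFILE and _log are minimal stand-ins for the
# ones in the real script.
LOGFILE=./test-log.txt
_log() { echo "[$(date +"%Y-%m-%d %H:%M:%S")]: $*" >> "$LOGFILE"; }

_do_cmd() {
  local logfile_tmp="$LOGFILE.tmp"
  _log " ACTION: $*"
  "$@" 2> "$logfile_tmp"        # capture stderr synchronously, no subshell
  local ret=$?
  cat "$logfile_tmp" >&2         # replay the errors on the terminal
  if [[ $ret -ne 0 ]]; then
      _log "ERROR: Return code $ret"
      grep -v "frame= " "$logfile_tmp" >> "$LOGFILE"
      rm -f "$logfile_tmp"
      exit $ret
  fi
  rm -f "$logfile_tmp"
}

_do_cmd true    # succeeds: logs the action, cleans up the temp file
```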

liv913