In the following script I'm iterating over four different paths and running a custom command on each one (here named /usr/bin/mycommand).

I'm trying to restart that command automatically, up to three times, if it fails, using a while loop based on the command's exit code.

If the command still fails after three restarts, the script writes a failure notification to a temporary file, which it emails to me at the end.

The issue is that if several paths fail, I still only get one notification line in my email (for the most recent failing path), and I can't figure out why.

#!/bin/bash

## Variables
email="recipient@email.com"
localsrv='srv1'
remotesrv='srv2'
emailbody='/tmp/content.txt'
    
paths=( path1 path2
        path3 path4
)

for path in "${paths[@]}" ; do
    xcmd=0
    /usr/bin/mycommand $path
    status=$?
    while [ $status -ne 0 ]
    do
      /bin/echo "Failure. Restarting command for path $path..."
      xcmd=$(( $xcmd + 1 ))
      /bin/echo "Waiting for 60 seconds before retrying..."
      /bin/sleep 60s
      /usr/bin/mycommand $path
      status=$?
      if [ $xcmd -eq 3 ]; then
       /bin/echo "Three failures for path $path. Skipping." >> $emailbody
       break
      fi
    done
done

if [ -s $emailbody ]; then
    /bin/echo "Backup finished" >> $emailbody
    /bin/mail -s "Failure report from server $localsrv" $email < $emailbody
    /bin/rm $emailbody
else
    /bin/printf "Success!\nBackup finished\n" | /bin/mail -s "Success report from server $localsrv" $email
    /bin/rm $emailbody
fi
Jack557
  • Please build a [mre] -- the shortest possible code that has the same problem you're asking about when someone else runs it without changes. Code that sends email, requires `/usr/bin/mycommand` to exist, &c has no expectation that anyone else will be able to run it, so nobody else can test their answers. – Charles Duffy Aug 28 '22 at 12:23
  • BTW, have you considered storing an array of errors so you don't need to use a temporary file at all? Hardcoded temporary file names can be used by attackers -- think about what happens if someone who doesn't have permission to write to your home directory runs `ln -s /home/youruser/.bashrc /tmp/content.txt` before you run your script as yourself, or `ln -s /etc/passwd /tmp/content.txt` before you run it as root. That's part of why good practice is to _always_ use `mktemp` to generate random, unique temporary file names instead of hardcoding. – Charles Duffy Aug 28 '22 at 12:24
  • Another aside, not related to your immediate problem -- you've got some quoting bugs here; consider running your code through http://shellcheck.net/ and fixing what it finds. – Charles Duffy Aug 28 '22 at 12:26
  • (Also, think about looping with `while ! /usr/bin/mycommand "$path" && (( ++xcmd <= 3 )); do` -- see [Why is using $? to see if a command succeeded or not an antipattern?](https://stackoverflow.com/questions/36313216/why-is-testing-to-see-if-a-command-succeeded-or-not-an-anti-pattern); not needing to set `status` explicitly will make your logic considerably simpler.) – Charles Duffy Aug 28 '22 at 12:28
  • Another thing I'd suggest is logging the contents of `emailbody`. If you can distinguish whether it's not being written to at a point when you expect it; or has extra contents being ignored; or something else that's going on, that'll let you write a more narrow and specific question. Part of building a [mre] is simplifying as much as possible while still having the same issue -- remember, we don't want your real code, we want the shortest possible code that has the same bug. – Charles Duffy Aug 28 '22 at 12:31
  • Your underlying loop structure works for me as I suppose you intend when I replace `paths` with `1 2 3 4`, `/usr/bin/mycommand` with `/bin/false`, and `emailbody` with a filename in my home directory. I get an output file with a failure message for each item. – John Bollinger Aug 28 '22 at 12:39
  • Although the code presented is flawed, I am inclined to think that it does not capture the issue that is the subject of the question. I suspect external activity factors in: for example, maybe something else runs every 2-3 minutes and clears `/tmp` or replaces `/tmp/content.txt` with an empty file. – John Bollinger Aug 28 '22 at 12:43
  • To capture all messaging from the start to the point of failure, you need to move the ">> $emailbody" from the echo statement to directly after the outermost "done". That way it will encompass all messaging. I would also add "2>&1" to ensure runtime error messages are captured in sequence, and a separator comment such as "################" to mark the division between the messages from each attempt. Until you get a smoothly running script, I would also make only one attempt per path and analyze the log messages for corrective actions you can take in the script. – Eric Marceau Sep 18 '22 at 18:17
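
The suggestions in the comments above (test the command directly in the `while` condition, collect failures in an array instead of a hardcoded temporary file, and reproduce with a command that always fails) could be combined roughly as follows. This is only a sketch under stated assumptions: `mycommand` is a stand-in function that always fails, `max_retries` and `retry_delay` are hypothetical names, and the report is printed rather than mailed so that anyone can run it unchanged.

```shell
#!/bin/bash
# Sketch only. Stand-in for /usr/bin/mycommand that always fails (like /bin/false).
mycommand() { return 1; }

max_retries=3   # hypothetical name: retries allowed after the first attempt
retry_delay=0   # hypothetical name: seconds between attempts (the question uses 60)

paths=( path1 path2 path3 path4 )
errors=()       # failure messages are collected here instead of /tmp/content.txt

for path in "${paths[@]}"; do
    retries=0
    # Test the command directly rather than inspecting $? afterwards.
    while ! mycommand "$path"; do
        if (( retries >= max_retries )); then
            errors+=( "Command failed for path $path after $max_retries retries. Skipping." )
            break
        fi
        (( retries += 1 ))
        echo "Failure. Restarting command for path $path (retry $retries)..."
        sleep "$retry_delay"
    done
done

# Build the report from the array; printed here instead of being mailed.
if (( ${#errors[@]} > 0 )); then
    printf '%s\n' "${errors[@]}" "Backup finished"
else
    printf '%s\n' "Success!" "Backup finished"
fi
```

With the always-failing stand-in, every path should contribute its own line to `errors`. To send the report, the final `printf` output could be piped to `/bin/mail -s ... "$email"` as in the question. Note one behavioral difference from the loop in the question: here a path is reported only if the last retry also fails, whereas the original appends its failure line as soon as `xcmd` reaches 3, even when that third retry succeeded.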
