
Using shell scripting, I am splitting one long data file into 8 files and processing them in parallel as 8 background instances.

function_child()
{
while read -r record
do
 ###process to get the data by arsdoc get##
    exit 12  ## if get fails##
 ### afp2pdf ###
    exit 12  ## if afp2pdf fails ###
 ### logic ###
    exit 12  ## if logic fails####
done < $1
}

## main ##
for file in /$MY_WORK/CCN_split_files/*; do
   proceed_task "$file" &
   PID="$!"
   echo "$PID:$file" | tee $tmp_file
   PID_LIST+="$PID "
done

How can I monitor the exit codes and PIDs of the child processes when there is a failure? I tried the approach below: once all the processes have been sent to the background, I use 'wait' to wait for each PID in PID_LIST to exit, and then capture and print its exit status.

for process in "${PID_LIST[@]}";do
   wait "$process"
   exit_status=$?
   file_name=`egrep $process $tmp_file | awk -F ":" '{print $2}' | rev | awk -F "/" '{print $2}' | rev`
   echo "$file_name exit status: $exit_status"
done

but it gives an error

 line 49: wait: `23043 23049 ': not a pid or valid job spec
grep: 23049: No such file or directory

Could someone help me with this? Thank you.

  • Consider using **GNU Parallel** rather than re-inventing the wheel. It will divide your file up for you, tag the output, run it across multiple machines in your network and do error-handling... https://stackoverflow.com/a/59951897/2836621 – Mark Setchell Feb 03 '20 at 10:35
  • @Mark, thank you. But I am not looking for a GNU tool, I am looking for Linux code to solve this. – sai prudhvi Feb 03 '20 at 11:13
  • Possible duplicate of [When to wrap quotes around a shell variable?](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) – tripleee Feb 03 '20 at 11:36
  • Don't you want just `xargs -P8 -n1 arsdoc`? Do you call arsdoc for each line? You showed code with `pids[$pid]=$file` and now you show code with `"${PID_LIST[@]}"`, these codes are unrelated. – KamilCuk Feb 03 '20 at 11:38
  • GNU and Linux go hand-in-hand... not sure I understand your comment, but good luck with your project. – Mark Setchell Feb 03 '20 at 12:10
  • @KamilCuk, apologies, I have updated the code. Thank you. – sai prudhvi Feb 03 '20 at 12:19

4 Answers


See: help jobs and help wait

Collect return status at end of your code

for pid in $(jobs -rp); do
  printf "Job %d handling file %q is still running\n" "$pid" "${pids[pid]}"
done

for pid in "${!pids[@]}"; do
  # wait sets $? to the exit status of the child it waited for
  wait "$pid"; status=$?
  printf "Job %s handling file %q has returned with status %d\n" "$pid" "${pids[pid]}" "$status"
done
Léa Gris

The double quotes around "${PID_LIST[@]}" in the for loop turn the whole list into a single string, which is then passed to wait as one argument. Remove those quotes to have the shell split the string into individual PIDs.

tripleee

Use wait on proper pid numbers.

function_child() {
    while read -r record; do
        # let's return a random number!
        exit ${RANDOM}
    done <<<'a'
}

mkdir -p my-home/dir
touch my-home/dir/{1..9}

for file in my-home/dir/*; do
    function_child "$file" &
    pid=$!
    echo "Backgrounded: $file (pid=$pid)"
    pids[$pid]=$file
done

for i in "${!pids[@]}"; do
    wait "$i"
    ret=$?
    echo ${pids[$i]} returned with $ret
done

outputs on repl:

Backgrounded: my-home/dir/1 (pid=84)
Backgrounded: my-home/dir/2 (pid=85)
Backgrounded: my-home/dir/3 (pid=86)
Backgrounded: my-home/dir/4 (pid=87)
Backgrounded: my-home/dir/5 (pid=88)
Backgrounded: my-home/dir/6 (pid=89)
Backgrounded: my-home/dir/7 (pid=90)
Backgrounded: my-home/dir/8 (pid=91)
Backgrounded: my-home/dir/9 (pid=92)
my-home/dir/1 returned with 241
my-home/dir/2 returned with 59
my-home/dir/3 returned with 235
my-home/dir/4 returned with 11
my-home/dir/5 returned with 6
my-home/dir/6 returned with 222
my-home/dir/7 returned with 230
my-home/dir/8 returned with 189
my-home/dir/9 returned with 195

That said, I think it is better to just use xargs or another tool designed to run such jobs in parallel.

 printf "%s\n" my-home/dir/* | xargs -d$'\n' -n1 -P8 sh -c 'echo "$1"; ###process to get the data by arsdoc get' --
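One thing to keep in mind with xargs is that you get a single aggregated exit status rather than one per file. A minimal sketch, assuming a hypothetical worker that fails with status 12 for the item named "b" (GNU xargs reports 123 when any invocation exits with a status from 1 to 125; other implementations may report a different nonzero value):

```bash
#!/usr/bin/env bash
# Hypothetical worker: exits 12 for the item "b", 0 for everything else.
xargs_status=0
printf '%s\n' a b c |
    xargs -n1 -P3 sh -c '[ "$1" != b ] || exit 12' -- || xargs_status=$?

# Nonzero means at least one invocation failed; which one is not reported.
echo "xargs exit status: $xargs_status"
```

If you need to know which file failed, have the worker itself log the file name and status, or wait on each PID as shown above.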

@KamilCuk, apologies, I have updated the code.

PID_LIST+="$PID " builds one long string variable containing space-separated PIDs. "${PID_LIST[@]}" is an expansion meant for arrays. Applied to a plain string variable, it simply expands to the value of the variable, just like "$PID_LIST", so here it expands to "23043 23049 ". Because it is quoted, the loop iterates over a single element and runs wait "23043 23049 ", which produces the error message you see.
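The difference is easy to see in isolation. A small sketch that only counts loop iterations, using the two PID values from the error message as placeholder strings (nothing is actually waited on, and the variable names are made up for the demo):

```bash
#!/usr/bin/env bash
# String built the way the question does it: one value with spaces inside.
pid_string=""
pid_string+="23043 "
pid_string+="23049 "

# Real array: one element per PID.
pid_array=()
pid_array+=("23043")
pid_array+=("23049")

# "${var[@]}" on a plain string behaves like "$var": a single element.
string_count=0
for p in "${pid_string[@]}"; do
    string_count=$((string_count + 1))
done

array_count=0
for p in "${pid_array[@]}"; do
    array_count=$((array_count + 1))
done

echo "string iterations: $string_count"
echo "array iterations:  $array_count"
```

The string version loops once with the whole "23043 23049 " value, while the array version loops once per PID.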

Not recommended: you could rely on the shell's word splitting:

for process in $PID_LIST; do
     wait "$process"

But just use an array:

    PID_LIST+=("$PID")
done

for process in "${PID_LIST[@]}"; do
    wait "$process"

If you do not feel safe with the sparse pids[$pid]=$file array, use two arrays instead:

     onlypids+=("$pid")
     files+=("$file")
done

for i in "${!onlypids[@]}"; do
     pid="${onlypids[$i]}"
     file="${files[$i]}"
     wait "$pid"

Note that by convention, upper case variable names are for exported variables.
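Putting the two-array fragments together, a complete run could look roughly like the sketch below. The process_file function, the temporary directory and the part_* file names are stand-ins invented for the demo; in the real script the function body would be the arsdoc get / afp2pdf / logic steps.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the real per-file work; fails for names containing "bad".
process_file() {
    case "$1" in
        *bad*) exit 12 ;;   # simulate a failing step
        *)     exit 0 ;;
    esac
}

# Demo input files (made-up names) in a throwaway directory.
work_dir=$(mktemp -d)
touch "$work_dir/part_ok_1" "$work_dir/part_bad_2" "$work_dir/part_ok_3"

pids=()
files=()
for file in "$work_dir"/*; do
    process_file "$file" &      # runs in a background subshell
    pids+=("$!")
    files+=("$file")
done

failed=()
for i in "${!pids[@]}"; do
    status=0
    wait "${pids[$i]}" || status=$?   # exit status of that child
    echo "pid ${pids[$i]} (${files[$i]##*/}) exit status: $status"
    if [ "$status" -ne 0 ]; then
        failed+=("${files[$i]##*/}:$status")
    fi
done

echo "failed jobs: ${failed[*]:-none}"
rm -rf "$work_dir"
```

Because the function is started with &, its exit only ends the background subshell, and wait hands that status back to the parent, so the parent can decide what to do about each failed file.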

KamilCuk
  • Before the loop over the PID indexes, run `wait "${!pids[@]}"` to wait for all children to complete. Then you can loop over each individual PID to recover its return status. – Léa Gris Feb 03 '20 at 12:36

You mention in the comments that you do not want to use GNU Parallel, so this answer is for people who do not have that restriction.

doit()  {
  record="$1"
  ###process to get the data by arsdoc get##
     exit 12  ## if get fails##
  ### afp2pdf ###
     exit 12  ## if afp2pdf fails ###
  ### logic ###
     exit 12  ## if logic fails####
}
export -f doit

cat /$MY_WORK/CCN_split_files/* |
  parallel --joblog my.log doit
# Field 7 of my.log is the exit value

# If you have an unsplit version of the input you can have GNU Parallel process it:
# cat /$MY_WORK/CNN_big_file |
#   parallel --joblog my.log doit


Ole Tange