0

I am trying to extract some data from a DB and monitor it's progress, since the tool that pulls the data doesn't appear to have a mechanism for doing so. So, I fork the command chain and then continue to count the lines of the output file until it watch the count of rows expected.

The problem is (I believe) I am forking the command chain properly. And, although indeed data does flow, it isn't forking, because the code that monitors the file isn't executing. What am I missing?

I am tagging this with sqlplus incase someone from that community knows how to do this natively

#!/usr/bin/env zsh
set -e
set -o pipefail
set -x
set -m
function extract() {
  local from="${1:?You must supply a query from date as 'YYYY-MM-DD HH24:MI:SS'}"
  local to="${2:?You must supply a query from date as 'YYYY-MM-DD HH24:MI:SS'}"
  local u="${3:-${OMC_USERNAME:?no db username supplied}}"
  local p="${4:-${OMC_PASSWORD:?No db password supplied}}"

  local workdir records
  workdir="$(mkWorkdir "${from}" "${to}")"
  records=$(sqlplus -s "$u/$p@sredb1_high" @getdata.sql "${from}" "${to}" | tail +2 | tr -d ' ')

  # the first 2 lines are skipped because, apparently, when we do variable substitution, it lets us know
  # on stdout and not stderr
  (sqlplus -s "$u/$p@sredb1_high" @getdata.sql "${from}" "${to}" | tail +2 > "${workdir}/data.psv") &
  local process=$!
  local count="0"
  until [ "${count}" -eq "${records}" ]; do
    count=$(wc -l "${workdir}/data.psv")
    printf '%.2f%%\r' $((count / records))
    sleep 5
  done
  wait $process
}

Output (some execution you see if due to this code being loaded into another script, but I don't believe it's relevant):

./omc.sh extract '2023-05-01 00:00:00' '2023-05-01 23:59:59'
+ [[ 3 == 0 ]]
++ dirname ./omc.sh
+ source ./commands/extract.sh
++ set -e
++ set -o pipefail
++ set -x
++ set -m
+ extract '2023-05-01 00:00:00' '2023-05-01 23:59:59'
+ local 'from=2023-05-01 00:00:00'
+ local 'to=2023-05-01 23:59:59'
+ local u=uuu
+ local p=pppp
+ local workdir records
++ mkWorkdir '2023-05-01 00:00:00' '2023-05-01 23:59:59'
++ local 'from=2023-05-01 00:00:00'
++ local 'to=2023-05-01 23:59:59'
++ isDate '2023-05-01 00:00:00'
++ grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'
++ isDate '2023-05-01 23:59:59'
++ grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'
++ local workdir=2023-05-01-00:00:00_2023-05-01-23:59:59
++ test -d 2023-05-01-00:00:00_2023-05-01-23:59:59
++ echo 2023-05-01-00:00:00_2023-05-01-23:59:59
+ workdir=2023-05-01-00:00:00_2023-05-01-23:59:59
++ sqlplus -s uuu/pppp#@sredb1_high @getdata.sql '2023-05-01 00:00:00' '2023-05-01 23:59:59'
++ tail +2
++ tr -d ' '
^C^C^C^C^C^D

I end up killing the process since it's not doing as I expect

Shawn
  • 47,241
  • 3
  • 26
  • 60
Christian Bongiorno
  • 5,150
  • 3
  • 38
  • 76
  • `strace` is your friend. I'm almost certain you'll see on running such a trace that sqlplus isn't closing its stdout. If it doesn't, then of course tail and tr won't either. – Charles Duffy Jul 13 '23 at 18:32
  • (of course, that particular failure mode is one you can check for just as well with lsof or by inspecting procfs). – Charles Duffy Jul 13 '23 at 18:34
  • It's also good form to redirect stdin from `/dev/null` whenever you don't have a _very_ specific reason to leave a program at the head of a pipeline connected to the terminal... and of course, turn off sqlplus silent mode when troubleshooting; back in the day I wrote my own SQL client to use in scripts because sqlplus was so very adamant about being difficult to wrangle) – Charles Duffy Jul 13 '23 at 18:37
  • Consider the advice to redirect stdin doubled, by the way, if this is running in an interpreter with job control enabled. You really don't want a SIGTTIN. – Charles Duffy Jul 13 '23 at 18:40
  • @CharlesDuffy - I don't understand your 3rd and 4th comments. How does one redirect stdin from `/dev/null`? `/dev/null > /dev/stdin` ? – Christian Bongiorno Jul 13 '23 at 23:20
  • 1
    In this case, the stdin redirection will be `command < /dev/null`. See [here](https://stackoverflow.com/a/19956266/20307768). – Constantin Hong Jul 13 '23 at 23:49
  • 1
    (I didn't comment on this myself earlier because Constantin got it right; but to expand -- this would be something like ` "${workdir}/data.psv"` -- mind, the position of ` – Charles Duffy Jul 14 '23 at 15:06
  • BTW, think about adding a `tee` component to the pipeline to get a log you can review that hasn't been through `tail` or `tr`. – Charles Duffy Jul 14 '23 at 15:10
  • From the log, looks like script only got to the first sql call: ` records=$(sqlplus -s "$u/$p@sredb1_high" @getdata.sql "${from}" "${to}" | tail +2 | tr -d ' ')`, and never moved pass that point. If the process take lot of time, does not make sense to run it twice. Try to find a different way to find expected line count. – dash-o Jul 15 '23 at 10:47

0 Answers0