
I have the following log in the Apache Airflow GUI:

*** Reading local file: /home/ubuntu/airflow/logs/risk_position/delta_risk/2020-11-25T06:38:38.444673+00:00/1.log
[2020-11-25 06:52:40,950] {taskinstance.py:670} INFO - Dependencies all met for <TaskInstance: risk_position.delta_risk 2020-11-25T06:38:38.444673+00:00 [queued]>
[2020-11-25 06:52:40,964] {taskinstance.py:670} INFO - Dependencies all met for <TaskInstance: risk_position.delta_risk 2020-11-25T06:38:38.444673+00:00 [queued]>
[2020-11-25 06:52:40,965] {taskinstance.py:880} INFO - 
--------------------------------------------------------------------------------
[2020-11-25 06:52:40,965] {taskinstance.py:881} INFO - Starting attempt 1 of 6
[2020-11-25 06:52:40,965] {taskinstance.py:882} INFO - 
--------------------------------------------------------------------------------
[2020-11-25 06:52:40,974] {taskinstance.py:901} INFO - Executing <Task(BashOperator): delta_risk> on 2020-11-25T06:38:38.444673+00:00
[2020-11-25 06:52:40,977] {standard_task_runner.py:54} INFO - Started process 18650 to run task
[2020-11-25 06:52:41,002] {standard_task_runner.py:77} INFO - Running: ['airflow', 'run', 'risk_position', 'delta_risk', '2020-11-25T06:38:38.444673+00:00', '--job_id', '64', '--pool', 'default_pool', '--raw', '-sd', '/home/ubuntu/.local/lib/python3.6/site-packages/airflow/example_dags/risk_position.py', '--cfg_path', '/tmp/tmp1kqxd_yj']
[2020-11-25 06:52:41,003] {standard_task_runner.py:78} INFO - Job 64: Subtask delta_risk
[2020-11-25 06:52:41,024] {logging_mixin.py:112} INFO - Running %s on host %s <TaskInstance: risk_position.delta_risk 2020-11-25T06:38:38.444673+00:00 [running]> ip-************.ap-northeast-1.compute.internal
[2020-11-25 06:52:41,035] {bash_operator.py:113} INFO - Tmp dir root location: 
 /tmp
[2020-11-25 06:52:41,036] {bash_operator.py:136} INFO - Temporary script location: /tmp/airflowtmpowss08ak/delta_riskhmnyrm0e
[2020-11-25 06:52:41,036] {bash_operator.py:146} INFO - Running command: /home/ubuntu/extra/cronjobs/unify_report.sh delta_risk || exit 1 
[2020-11-25 06:52:41,042] {bash_operator.py:153} INFO - Output:
[2020-11-25 06:52:41,044] {bash_operator.py:157} INFO - delta_risk report
[2020-11-25 06:52:41,046] {bash_operator.py:157} INFO - output  /home/ubuntu/market_risk/delta_risk/output 2020-11-24delta_risk.xlsx
[2020-11-25 06:52:41,046] {bash_operator.py:157} INFO - exists /home/ubuntu/extra/cronjobs/unify_reports
[2020-11-25 06:52:41,971] {bash_operator.py:161} INFO - Command exited with return code 0
[2020-11-25 06:52:41,976] {taskinstance.py:1070} INFO - Marking task as SUCCESS.dag_id=risk_position, task_id=delta_risk, execution_date=20201125T063838, start_date=20201125T065240, end_date=20201125T065241
[2020-11-25 06:52:45,928] {local_task_job.py:102} INFO - Task exited with return code 0

As this shows, there is a bash command, Running command: /home/ubuntu/extra/cronjobs/unify_report.sh delta_risk || exit 1, which runs the delta_risk.py file. This .py file raises a traceback and does not finish properly. I added the || exit 1 hoping that the bash script would then also report an error and make the task fail in the Airflow GUI. But the task finishes with success in Airflow. I want it to fail along with the Python file when there is a traceback error.

How can I make this happen, please?

EDIT:

Below is the part of the .sh that runs the Python file and creates a log file:

(
    # run each code to generate files
    cd "${SOURCE_PATH_base}/${SUB_PATH}"
    python3 $REPORT_TYPE.py || exit 1
    # output file link
    dir=$top_dir/output_$REPORT_TYPE
    echo $dir 'output link'
    if [ -e $dir ]; then
        echo 'exists' $dir
    else
        # create link to real path
        echo 'no output link' $dir
        echo 'real output path' ${OUTPATH}
        cd ${TOPDIR}/${JOBDIR}
        trap `ln -s "${OUTPATH}" "output_${REPORT_TYPE}"` 1 2 3 15
        echo 'created output link' $dir
    fi
    # upload to gdrive
    cd "${SOURCE_PATH_base}/${INTERAPI_PATH}"
    python3 teamdrive_control.py Market_Risk $REPORT_TYPE "${OUTPATH}/${OUT}" "${TO}" "${FILENO_FLAG}" "${SUBJECT}"
) 2>&1 | xz -9ec > logs/${JOBDIR}-$(date +%s)_$REPORT_TYPE.log.xz
  • Probably the script masks the error. We can't see its source, so we can't tell you how to fix it. A properly written shell script will propagate the failure to its caller (an improperly written one might simply `exit 0` at the end regardless of what happened before). – tripleee Nov 25 '20 at 08:35
  • @tripleee thanks for your comment. Do you think that creation of a log as I do above in the sh file could "mask" as you mentioned? – delalma Nov 25 '20 at 08:49
  • Yes, the exit status from the pipeline will be the exit status from the last command in the pipeline. – tripleee Nov 25 '20 at 08:59

1 Answer


Indeed, the pipeline in your script will mask any error from within the subshell.
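
You can see the masking with a stand-alone one-liner (a minimal sketch, not taken from your script; it just mimics the failing-Python-into-xz shape):

( python3 -c 'raise SystemExit(1)' ) 2>&1 | xz -9ec > /dev/null
echo $?    # prints 0: the pipeline reports xz's status, not the subshell's failure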

I'm guessing you are using a subshell just to get the output redirection to apply to the entire command. A better design for that is to put the code in a function, which can also then perform a non-local exit in the case of a fatal error.

fun () {
    # run each code to generate files
    cd "${SOURCE_PATH_base}/${SUB_PATH}"
    python3 "$REPORT_TYPE.py" || exit 1
    # output file link
    dir=$top_dir/output_$REPORT_TYPE
    echo "$dir output link"
    if [ -e "$dir" ]; then
        echo "exists $dir"
    else
        # create link to real path
        echo "no output link $dir"
        echo "real output path ${OUTPATH}"
        cd "${TOPDIR}/${JOBDIR}"
        # This is really weird; did you mean to put single quotes?
        trap `ln -s "${OUTPATH}" "output_${REPORT_TYPE}"` 1 2 3 15
        echo "created output link $dir"
    fi
    # upload to gdrive
    cd "${SOURCE_PATH_base}/${INTERAPI_PATH}"
    python3 teamdrive_control.py Market_Risk "$REPORT_TYPE" "${OUTPATH}/${OUT}" "${TO}" "${FILENO_FLAG}" "${SUBJECT}" || exit
}

fun 2>&1 | xz -9ec > "logs/${JOBDIR}-$(date +%s)_$REPORT_TYPE.log.xz"
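
Even with the function, the exit status of that pipeline is still the exit status of xz, the last command, as discussed in the comments. If the whole script should exit non-zero when fun fails, one option (a sketch, assuming the script runs under Bash) is pipefail:

set -o pipefail    # make a pipeline fail if any element in it fails
fun 2>&1 | xz -9ec > "logs/${JOBDIR}-$(date +%s)_$REPORT_TYPE.log.xz" || exit 1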

The trap looks really wrong; did you mean to use single quotes there? The current code will attempt to run ln -s immediately and run its output as the trap.
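
If the intent really was to register the link creation as a handler for those signals, single quotes would defer the command until a signal arrives (shown only as a guess at the intent, not a recommendation):

trap 'ln -s "${OUTPATH}" "output_${REPORT_TYPE}"' 1 2 3 15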

Notice also the various quoting fixes; you had generally quoted everything which didn't need to be quoted, and left the stuff which absolutely needs quoting outside quotes. Perhaps see also When to wrap quotes around a shell variable; probably also get into the habit of running http://shellcheck.net/ on your code.
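
A generic illustration of why the quoting matters (the value here is invented for the example):

dir='/tmp/report output'    # a value containing a space, purely for illustration
[ -e $dir ]                 # word-splits into two arguments and the test breaks
[ -e "$dir" ]               # tests the single intended path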

It doesn't matter much here, because you redirect everything to the same file anyway, but diagnostic messages should generally be redirected to standard error. A good practice is also to include the name of the script which printed the diagnostic, so you can see where it comes from even when you have scripts calling scripts calling scripts etc.

echo "$0: values of β will lead to dom" >&2