I am trying to run Django inside Apache Airflow, following this idea of a custom operator. With the default logging configuration, i.e. `AIRFLOW__LOGGING__REMOTE_LOGGING=False`, everything works as expected.
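For completeness, remote logging is enabled through the standard Airflow environment variables; the region, account ID, and connection ID below are placeholders for my actual values:

```
AIRFLOW__LOGGING__REMOTE_LOGGING=True
AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID=aws_default
AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER=cloudwatch://arn:aws:logs:<region>:<account-id>:log-group:job-sentinel-image-layer-log-group-5f303da
```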
However, when I enable remote logging to CloudWatch, the task seems to get stuck as soon as Django starts to configure logging, as invoked by `django.setup()`. I can confirm that logging to CloudWatch itself works, as I have added a few logging events:
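For context, the operator follows the pattern from the linked post. A minimal sketch of it (the settings module name and the debug print are simplified placeholders for what I actually run):

```python
import os

from airflow.models.baseoperator import BaseOperator


class DjangoOperator(BaseOperator):
    """Boots Django in pre_execute() so that execute() can use the ORM."""

    def pre_execute(self, context):
        print("We are in pre-execute")  # visible in the task log below
        # "mysite.settings" is a placeholder for the real settings module.
        os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")

        import django

        # With remote logging enabled, the task never gets past this call:
        # django.setup() configures logging before populating the apps.
        django.setup()
```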
```
*** Reading remote log from Cloudwatch log_group: job-sentinel-image-layer-log-group-5f303da log_stream: dag_id=clear_tokens/run_id=scheduled__2023-07-03T22_00_00+00_00/task_id=execute_clear_tokens_command/attempt=1.log.
[2023-07-05, 17:23:27 CEST] Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: clear_tokens.execute_clear_tokens_command scheduled__2023-07-03T22:00:00+00:00 [queued]>
[2023-07-05, 17:23:27 CEST] Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: clear_tokens.execute_clear_tokens_command scheduled__2023-07-03T22:00:00+00:00 [queued]>
[2023-07-05, 17:23:27 CEST]
--------------------------------------------------------------------------------
[2023-07-05, 17:23:27 CEST] Starting attempt 1 of 2
[2023-07-05, 17:23:27 CEST]
--------------------------------------------------------------------------------
[2023-07-05, 17:23:27 CEST] Executing <Task(DjangoOperator): execute_clear_tokens_command> on 2023-07-03 22:00:00+00:00
[2023-07-05, 17:23:27 CEST] Started process 24150 to run task
[2023-07-05, 17:23:27 CEST] Running: ['airflow', 'tasks', 'run', 'clear_tokens', 'execute_clear_tokens_command', 'scheduled__2023-07-03T22:00:00+00:00', '--job-id', '3', '--raw', '--subdir', 'DAGS_FOLDER/management_commands/src/clear_tokens.py', '--cfg-path', '/tmp/tmpkmehxuz0']
[2023-07-05, 17:23:27 CEST] Job 3: Subtask execute_clear_tokens_command
[2023-07-05, 17:23:28 CEST] Running <TaskInstance: clear_tokens.execute_clear_tokens_command scheduled__2023-07-03T22:00:00+00:00 [running]> on host 147b23cea447
[2023-07-05, 17:23:28 CEST] Exporting the following env vars:
AIRFLOW_CTX_DAG_ID=clear_tokens
AIRFLOW_CTX_TASK_ID=execute_clear_tokens_command
AIRFLOW_CTX_EXECUTION_DATE=2023-07-03T22:00:00+00:00
AIRFLOW_CTX_TRY_NUMBER=1
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2023-07-03T22:00:00+00:00
[2023-07-05, 17:23:28 CEST] We are in pre-execute
[2023-07-05, 17:23:28 CEST] after import of `from django.utils.log import configure_logging` in django.setup()
[2023-07-05, 17:30:06 CEST] State of this instance has been externally set to failed. Terminating instance.
[2023-07-05, 17:30:06 CEST] Sending Signals.SIGTERM to group 24150. PIDs of all processes in the group: [24150]
[2023-07-05, 17:30:06 CEST] Sending the signal Signals.SIGTERM to group 24150
[2023-07-05, 17:30:06 CEST] Received SIGTERM. Terminating subprocesses.
[2023-07-05, 17:30:06 CEST] Process psutil.Process(pid=24150, status='terminated', exitcode=0, started='15:23:27') (24150) terminated with exit code 0
```
Note the time gap between the last two log lines: the task sat there for more than six minutes until I marked it as failed, which is the only option at that point. Do I need to set up something specific in the Django logging settings for this to work?
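For reference, the last debug line in the log is printed just before `configure_logging` runs inside `django.setup()` (abridged from Django's source):

```python
# django/__init__.py (abridged)
def setup(set_prefix=True):
    from django.apps import apps
    from django.conf import settings
    from django.utils.log import configure_logging

    # The task hangs somewhere in this call when CloudWatch logging is on.
    configure_logging(settings.LOGGING_CONFIG, settings.LOGGING)
    apps.populate(settings.INSTALLED_APPS)
```

Per the Django docs, setting `LOGGING_CONFIG = None` in the settings skips this step entirely, which would be a workaround, but I would rather keep Django's logging configuration intact.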