
Trying to write "hello world" into an Airflow log (Airflow 1.10.3). Based on the SO solutions presented here and here, I should be able to just `import logging` and call `logging.info('hello world')`. That doesn't seem to work for me.

import logging
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.python_operator import PythonOperator

default_args = {
    'owner': 'benten',
    'depends_on_past': False,
    'start_date': datetime(2019, 7, 25),
    'email_on_failure': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=1),
    }

def logging_is_fun():
    logging.debug("hellow world")
    logging.info("hello world")
    logging.critical("hello world")
    return None

with DAG('fun_logs', schedule_interval='45 * * * *', default_args=default_args) as dag:
    log_task = PythonOperator(python_callable=logging_is_fun, task_id='log_test_task')

I trigger the DAG manually and the task executes with no problems. But alas, when I check the logs, all I see is this:

*** Reading local file: /home/ubuntu/airflow/logs/fun_logs/log_test_task/2019-08-31T19:22:49.653712+00:00/1.log

Where are my amazing "hello world" statements? I don't expect to see all of them, given my log level setting. I do expect to see the critical message, though.

My airflow.cfg has the following in it (all default settings to the best of my knowledge):

# The folder where airflow should store its log files
# This path must be absolute
base_log_folder = /home/ubuntu/airflow/logs

# Airflow can store logs remotely in AWS S3, Google Cloud Storage or Elastic Search.
# Users must supply an Airflow connection id that provides access to the storage
# location. If remote_logging is set to true, see UPDATING.md for additional
# configuration requirements.
remote_logging = False
remote_log_conn_id =
remote_base_log_folder =
encrypt_s3_logs = False

# Logging level
logging_level = WARN
fab_logging_level = WARN

# Logging class
# Specify the class that will specify the logging configuration
# This class has to be on the python classpath
# logging_config_class = my.path.default_local_settings.LOGGING_CONFIG
logging_config_class =

# Log format
log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s
simple_log_format = %%(asctime)s %%(levelname)s - %%(message)s

# Log filename format
log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log
log_processor_filename_template = {{ filename }}.log
dag_processor_manager_log_location = /home/ubuntu/airflow/logs/dag_processor_manager/dag_processor_manager.log
benten

4 Answers


Set logging_level = INFO instead of WARN in airflow.cfg and you should be able to see your logs.
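If you want to double-check which level Airflow actually resolved (for example, if there is more than one airflow.cfg or an environment-variable override in play), a quick sanity check along these lines should work on 1.10.x; `conf` is Airflow's own config parser, and remember the scheduler/webserver need a restart after editing airflow.cfg:

from airflow.configuration import conf

# Prints the effective logging level picked up from airflow.cfg (or from the
# AIRFLOW__CORE__LOGGING_LEVEL environment variable, if set).
print(conf.get("core", "logging_level"))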

Reason

logging_level controls which events Airflow writes to the logs: only records at or above that level get through. For example, using a deprecated Airflow operator generates an event that gets logged at WARN.

As far as your code is concerned, these are just normal Python statements that you want logged, and they fall under the INFO level in Airflow. So if you set logging_level to INFO, you should be able to see your logging statements.
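As a minimal sketch of the same callable with that in mind: logging.getLogger("airflow.task") (the logger another answer below mentions) targets the per-task log handler explicitly, and the inline comments assume the levels behave as described above:

import logging

# "airflow.task" is the logger Airflow attaches the per-task file handler to;
# records below the configured logging_level are filtered out.
log = logging.getLogger("airflow.task")

def logging_is_fun():
    log.debug("hello world")    # dropped unless logging_level = DEBUG
    log.info("hello world")     # visible once logging_level = INFO
    log.warning("hello world")  # visible even at the original WARN setting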

Amit Singh
  • Works! I was losing my mind trying to figure this out. Earlier my logs were blowing up in size, so I had set things to WARN. I didn't realize the logging level in my script would be overridden. – benten Sep 10 '19 at 15:33
  • @Amit Singh I have the same code, except my log level is INFO. I am not getting the debug output from logging.debug("hello world"). Is there something else I should enable? – pm1359 May 28 '21 at 08:13
  • Not sure, Maryam. It's been a long time since I've looked at this. – benten Jun 08 '21 at 15:24

You're just missing running the task. Just add `log_task.run()` and your logs will be there for you.
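For reference, a sketch of what that looks like against the DAG from the question; note that placing run() in the DAG file executes the task whenever the file itself is run (e.g. python fun_logs.py), once per schedule interval between start_date and now (hence the repeated lines in the output below), so it's really only useful for ad-hoc testing:

with DAG('fun_logs', schedule_interval='45 * * * *', default_args=default_args) as dag:
    log_task = PythonOperator(python_callable=logging_is_fun, task_id='log_test_task')
    # Runs the task directly, outside the scheduler, printing log records to stdout.
    log_task.run()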

I just tested it and it worked; here's what the output looks like:

[2019-09-03 02:19:15,990] {main.py:18} CRITICAL - hello world
[2019-09-03 02:19:16,024] {main.py:17} INFO - hello world
[2019-09-03 02:19:16,024] {main.py:18} CRITICAL - hello world
[2019-09-03 02:19:16,060] {main.py:17} INFO - hello world
[2019-09-03 02:19:16,061] {main.py:18} CRITICAL - hello world
[2019-09-03 02:19:16,100] {main.py:17} INFO - hello world
[2019-09-03 02:19:16,100] {main.py:18} CRITICAL - hello world
...
...

Hope it helps.

Haroun Mohammedi
  • Cool. You're triggering the job from the UI? I still don't see anything in my own logs when I run this code - even with `log_task.run()` (I've run it with that line both inside and outside the `with` clause). – benten Sep 03 '19 at 14:02
  • Doesn't seem to matter if it's the command line or not. I'm guessing something is wrong in my `airflow.cfg`. – benten Sep 03 '19 at 17:27
  • Did you try to use the default configuration? – Haroun Mohammedi Sep 03 '19 at 21:38
  • @Haroun Mohammedi I have the same code, instead log level is INFO. I am not getting the debug , logging.debug("hello world"). Is something else that I should enable? – pm1359 May 28 '21 at 08:15
  • I'm confused. Where is this `log_task.run()` coming from? Is it a Python function we need to import? – BertC Nov 29 '21 at 14:59

Disclaimer: I'm still figuring this out too. But this is my understanding so far.

Airflow logs in multiple ways. Tasks log to their own files, not to stdout the way the Airflow services themselves do.

You will probably find your logs through the web UI:

  1. Go to the DAG
  2. Click a task instance
  3. Click [View Log]

(Verified with your example DAG on my machine, using my own config: the above steps show "hello world", but the terminal/stdout does not.)

From what I've seen, these are the only logs affected by the logging_level configuration, which – by the way – is INFO by default.

From the docs, these logs are stored under base_log_folder at {dag_id}/{task_id}/{execution_date}/{try_number}.log
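As a rough illustration, plugging the DAG and task from the question into that template (with the base_log_folder from the question's airflow.cfg) reproduces the path shown in the question; the execution date and try number below are taken from that log line:

# Reconstruction of where one task-instance log file ends up, using the
# template {dag_id}/{task_id}/{execution_date}/{try_number}.log
base_log_folder = "/home/ubuntu/airflow/logs"         # from airflow.cfg
dag_id, task_id = "fun_logs", "log_test_task"
execution_date = "2019-08-31T19:22:49.653712+00:00"   # ISO timestamp of the run
try_number = 1

log_path = f"{base_log_folder}/{dag_id}/{task_id}/{execution_date}/{try_number}.log"
print(log_path)
# /home/ubuntu/airflow/logs/fun_logs/log_test_task/2019-08-31T19:22:49.653712+00:00/1.log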

Astronomer also has a guide here, but I haven't gotten around to trying it out.

AdamAL

I'm bumping an old thread here, but if someone is still struggling with this, I have another solution. In my case the logging_level was already set to INFO, and initializing the logger with "airflow.task" did nothing; no logs were showing up. What ended up working was clearing the "default_impersonation" setting in airflow.cfg. You only need that setting if you're unable to use the 'run_as_user' option. Reference here: https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#default-impersonation

After clearing the setting, logs showed up nicely.
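If it's useful, you can confirm what Airflow ends up with after the edit in the same way as checking logging_level above; this is just a sketch and assumes the option lives under [core], as in the linked reference:

from airflow.configuration import conf

# Empty string => no impersonation: tasks run as the worker's own user.
print(repr(conf.get("core", "default_impersonation")))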

LTJ
  • Was this in Composer? I've set my logging to INFO in Cloud Composer as below, but the messages still aren't showing up. – CClarke May 26 '22 at 08:57
  • 1
    No, this case was in local standalone mode. If it helps, you can also try and override the 'default_impersonation' option in Cloud Composer since it's not a blocked configuration option: https://cloud.google.com/composer/docs/composer-2/composer-features#airflow_configuration – LTJ May 27 '22 at 09:27