1

I am trying to backfill a job that requires the date to be set to the first day of last month.

I could use:

from datetime import date, timedelta

# Use a new name so the imported date class is not shadowed
first_of_last_month = (date.today().replace(day=1) - timedelta(days=1)).replace(day=1)

But I am not sure whether date.today() will return the day of the run when Airflow executes a backfill.

tobi6
Richard Chen

2 Answers

4

I got to this, which still outputs a timestamp. You would need to add some string formatting to get just a date, if that's what you need.

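from airflow import DAG
# Airflow 1.x import path; in Airflow 2 this is airflow.operators.python
from airflow.operators.python_operator import PythonOperator

# default_args is assumed to be defined elsewhere in the DAG file
dag = DAG('test', default_args=default_args, schedule_interval='@daily')

def test_print(value):
    print('Testing')
    print(value)

print_task = PythonOperator(
    task_id='test_print',
    python_callable=test_print,
    # months=-1 is relative, day=1 is absolute: first day of the previous month
    op_args=['{{ execution_date + macros.dateutil.relativedelta.relativedelta(months=-1, day=1) }}'],
    dag=dag,
)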

Note the mix of relative and absolute arguments to relativedelta: months=-1 is relative, day=1 is absolute. The dateutil documentation says of the absolute arguments:

year, month, day, hour, minute, second, microsecond: Absolute information (argument is singular); adding or subtracting a relativedelta with absolute information does not perform an arithmetic operation, but rather REPLACES the corresponding value in the original datetime with the value(s) in relativedelta.
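Outside of Airflow, the same combination can be tried directly with dateutil. This is only a small sketch: the execution_date value below is a made-up example, not something Airflow supplies here, and the strftime call is one possible way to get the date-only string mentioned above.

```python
from datetime import datetime
from dateutil.relativedelta import relativedelta

# Hypothetical execution date, chosen only for illustration.
execution_date = datetime(2018, 7, 30, 19, 20)

# months=-1 is relative (shift back one month);
# day=1 is absolute (replace the day of the month with 1).
first_of_last_month = execution_date + relativedelta(months=-1, day=1)

# The result is still a full timestamp, including the time of day.
print(first_of_last_month)

# String formatting reduces it to just the date.
print(first_of_last_month.strftime('%Y-%m-%d'))
```

The time of day is carried over unchanged because only the month and the day are touched.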

raphael
2

Airflow has no influence on the date.today() function.

In fact, if you use this approach you lose one of Airflow's greatest features: restartability at any given date.

There is no macro that I know of to get the first day of last month. You could put your calculation into a small function, though, and use context['execution_date'] instead of date.today(). See more macros here: https://airflow.apache.org/code.html#macros

Once you have a small function that returns the value you want, you can add it as your own macro. See more on that in this question: Make custom Airflow macros expand other macros
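Such a function might look like the sketch below. The name first_day_of_last_month is hypothetical, and the dates passed in are made-up examples standing in for an execution date. The point is that the result depends only on the argument, so a rerun or backfill for an old date gives the same answer every time.

```python
from datetime import date, timedelta

def first_day_of_last_month(d):
    """Return the first day of the month before the one containing d."""
    # Step back from the first of this month to the last day of the previous month...
    last_day_of_previous_month = d.replace(day=1) - timedelta(days=1)
    # ...then move to the first day of that month.
    return last_day_of_previous_month.replace(day=1)

# Example dates for illustration; in a DAG the argument would be the execution date.
print(first_day_of_last_month(date(2018, 7, 30)))
print(first_day_of_last_month(date(2018, 1, 15)))  # crosses a year boundary
```

To call it from a template, the function could be passed to the DAG through user_defined_macros, for example user_defined_macros={'first_day_of_last_month': first_day_of_last_month}.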

EDIT

You have tried this:

LAST_MONTH = '{{ (execution_date.replace(day=1) - macros.timedelta(days=1)).replace(day=1) }}'

Arbitrary Python functions cannot be called inside a Jinja template string unless they have been made available to the template. Again, I suggest you build a function with one parameter, a date, which returns the date you need. Then add this function to the available macros with the DAG property user_defined_macros and use it like LAST_MONTH = '{{ my_date_function_which_gives_my_needed_date(execution_date) }}'

Both links above give more detail, including step-by-step help.

tobi6
  • I put it as: LAST_MONTH = '{{ (execution_date.replace(day=1) - macros.timedelta(days=1)).replace(day=1) }}' But it seems to give me an error... – Richard Chen Jul 30 '18 at 19:20
  • Please edit your question and add the error. It is not possible to help if the only info we have is *seems to give me an error*. – tobi6 Jul 31 '18 at 07:34
  • Why do you fill a variable? Please edit your question and add the DAG code. Which kind of operator do you use? – tobi6 Jul 31 '18 at 07:39