
I have to

  • update a table Foo monthly
  • and another table Bar daily
  • and join these two tables daily and insert the result into a third table Bazz

Is it possible to configure that

  • Foo is updated on a certain day (say, the 5th),
  • while Bar is updated daily
  • and they are in the same DAG?
y2k-shubham
Shengxin Huang
  • Might be better to have them in separate dags, but you could have Foo check what day it is and if it isn't the 5th, it does nothing, while Bar and Bazz run. – chris.mclennon Jul 19 '19 at 02:03

1 Answer

This behaviour can be achieved within a single DAG by having the Foo task either skipped or short-circuited on the days it should not run.

Basically, your DAG would still run each day (schedule_interval='@daily'), but

  • on a daily basis, only your Bar task would run while Foo would get skipped (or short-circuited);
  • until some particular day (like the 5th of each month), when both would run.
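
To make the two bullets above concrete, here is a minimal sketch of the day-of-month check in plain Python. The names should_run_foo, tasks_for and FOO_DAY_OF_MONTH are made up for illustration and are not part of Airflow; the snippet only models which tasks do real work on a given daily run.

```python
from datetime import date

# Assumption: Foo should be refreshed on the 5th, as in the question.
FOO_DAY_OF_MONTH = 5


def should_run_foo(run_date, day_of_month=FOO_DAY_OF_MONTH):
    """Return True only on the day of the month when Foo should be updated."""
    return run_date.day == day_of_month


def tasks_for(run_date):
    """List the tasks that would do real work for a daily run on run_date."""
    tasks = ["Bar", "Bazz"]        # these run every day
    if should_run_foo(run_date):
        tasks.insert(0, "Foo")     # Foo joins in only on the chosen day
    return tasks


print(tasks_for(date(2019, 7, 4)))  # ['Bar', 'Bazz']
print(tasks_for(date(2019, 7, 5)))  # ['Foo', 'Bar', 'Bazz']
```

In an actual DAG, a function like should_run_foo could serve as the python_callable of a ShortCircuitOperator placed just before Foo, with the run date taken from the task context. Since Bazz sits downstream of Foo, it would probably also need a trigger rule that tolerates a skipped upstream task; how skips propagate differs between Airflow versions, so check that against the one you run.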

You can, of course, also model these as separate DAGs and chain them together (rather than individual tasks within a single DAG). This choice might be better as long as the number of DAGs that you are linking together is small.


Related: Schedule airflow job bi-weekly

y2k-shubham