I am still very new to Airflow and am trying to build a workflow that only processes data on business days. The flow logic should be:
    if date.is_busday():
        process_data()
    else:
        do_nothing()  # but mark the run as successful or skipped
I used the BranchPythonOperator but did not join the two branches at the end. If it is not a business day, I have it run a dummy operator branch. Otherwise, it goes on to a normal operator branch. Here is what the flow looks like: DAG diagram
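A minimal sketch of the kind of branch callable described above, not the exact code from the DAG. It assumes a business day simply means Monday through Friday (holidays are ignored), and the names `choose_branch`, `PROCESS_TASK_ID`, and `SKIP_TASK_ID`, along with the `skip_day` task id, are hypothetical:

```python
from datetime import date

# Hypothetical task ids: "load_data" is the normal processing branch,
# "skip_day" stands in for the dummy operator branch.
PROCESS_TASK_ID = "load_data"
SKIP_TASK_ID = "skip_day"


def choose_branch(run_date: date) -> str:
    """Return the task id of the branch to follow for run_date."""
    # Assumption: Monday (0) through Friday (4) are business days;
    # public holidays are not taken into account here.
    if run_date.weekday() < 5:
        return PROCESS_TASK_ID
    return SKIP_TASK_ID
```

A function like this would be passed as the `python_callable` of the BranchPythonOperator, which follows the task whose `task_id` is returned and skips the other branch.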
I am able to successfully skip non-business days using this method. However, I get this warning: WARNING - No DAG RUN present this should not happen
Additionally, when I set an SLA, I started getting emails for all the dummy tasks that were not run.
When I started reading about branch operators, all the examples eventually joined the branches (branch documentation, ex. 1, ex. 2, ex. 3). This question is the closest I found to mine, but it seems to have been solved by joining the branches: similar question. This other question doesn't join the branches, but its use case seems unique and complicated: ex. 4.
Since I get warnings and cannot find any examples of un-joined branches, I am now doubting my approach. My questions are:
- What is the best/simplest way to make this work in Airflow?
- If my approach is correct, how do I get rid of the warning message or is it just something I have to live with?
- If I had to join the branches, could I join the load_data branch and the dummy branch with another dummy branch? This seems kind of absurd to me but could it be the right way to avoid all the warnings?
Thanks in advance.