I have a requirement that I want to schedule an airflow job every alternate Friday. However, the problem is I am not able to figure out how to write a schedule for this.
I don't want to have multiple jobs for this.
I tried this
'0 0 1-7,15-21 * 5
However it's not working it's running from 1 to 7 and 15 to 21 everyday.
from shubham's answer I realize that we can have a PythonOperator which can skip the task for us. An I tried to implement the solution. However doesn't seem to work.
As testing this on 2 week period would be too difficult. This is what I did.
- I schedule the DAG to run every 5 mins
- However, I am writing python operator the skip althernate task (pretty similar to what I am trying to do, alternate friday).
DAG:
args = {
'owner': 'Gaurang Shah',
'retries': 0,
'start_date':airflow.utils.dates.days_ago(1),
}
dag = DAG(
dag_id='test_dag',
default_args=args,
catchup=False,
schedule_interval='*/5 * * * *',
max_active_runs=1
)
dummy_op = DummyOperator(task_id='dummy', dag=dag)
def _check_date(execution_date, **context):
min_date = datetime.now() - relativedelta(minutes=10)
print(context)
print(context.get("prev_execution_date"))
print(execution_date)
print(datetime.now())
print(min_date)
if execution_date > min_date:
raise AirflowSkipException(f"No data available on this execution_date ({execution_date}).")
check_date = PythonOperator(
task_id="check_if_min_date",
python_callable=_check_date,
provide_context=True,
dag=dag,
)