We are evaluating Airflow for scheduling and data pipeline design. However, we have not been able to find out how to achieve the following two tasks:

(1) How can we change the DAG schedule through the GUI? (2) How can we do incremental updates when the data source is Oracle or MySQL?

This is what we have tried:

(1) We tried changing the schedule of the DAG in the GUI, but it looks like that only changes the schedule of that particular instance. (2) We tried to handle the incremental update programmatically by storing the last column value. Is there a better way of doing incremental updates?

Cloud

1 Answer

1) You can't change the DAG schedule in the GUI; you have to set it in the Python code where you define the DAG.

2) How you do incremental updates is entirely up to you. However, I would use a combination of Airflow macros (https://airflow.apache.org/code.html#macros) and SQL files with Jinja templates (https://airflow.apache.org/concepts.html#jinja-templating).
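To make point 2 a bit more concrete, here is a minimal sketch of the idea: each run extracts only the rows that changed inside that run's date window, so no "last value" has to be stored. It does not use Airflow itself, so that it runs anywhere. In a real DAG the window would come from the `ds` template variable and the `macros.ds_add` macro, as shown in the comment. The `orders` table, the `updated_at` column, the helper names, and the use of `sqlite3` as a stand-in for Oracle or MySQL are all assumptions made for illustration, as is a daily schedule.

```python
import sqlite3
from datetime import date, timedelta

# In an Airflow-templated .sql file the same filter might look like this
# (Airflow fills in `ds` and `macros.ds_add` when the task runs):
#   SELECT id, updated_at FROM orders
#   WHERE updated_at >= '{{ ds }}'
#     AND updated_at <  '{{ macros.ds_add(ds, 1) }}'

# Hypothetical table and column names; bound parameters replace the template fields.
INCREMENTAL_SQL = """
SELECT id, updated_at FROM orders
WHERE updated_at >= :window_start AND updated_at < :window_end
ORDER BY id
"""


def ds_add(ds, days):
    """Stand-in for Airflow's macros.ds_add: shift a YYYY-MM-DD string by `days`."""
    return (date.fromisoformat(ds) + timedelta(days=days)).isoformat()


def extract_increment(conn, ds):
    """Return only the rows updated during the one-day window starting at `ds`."""
    params = {"window_start": ds, "window_end": ds_add(ds, 1)}
    return conn.execute(INCREMENTAL_SQL, params).fetchall()


def make_demo_source():
    """Build an in-memory table standing in for the Oracle/MySQL source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [
            (1, "2018-07-04 23:59:59"),
            (2, "2018-07-05 00:00:00"),
            (3, "2018-07-05 18:18:00"),
            (4, "2018-07-06 00:00:00"),
        ],
    )
    return conn


if __name__ == "__main__":
    source = make_demo_source()
    # Only rows 2 and 3 fall inside the 2018-07-05 window.
    print(extract_increment(source, "2018-07-05"))
```

Because the window is derived from the run date rather than from stored state, re-running a past date extracts the same slice again, which makes backfills and retries predictable. This only works if the source table has a reliable "last modified" column; otherwise storing the last seen key value, as you tried, is a reasonable alternative.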

Might be worth having a look through the Airflow documentation as it sounds like you're not entirely familiar with its concepts.

Simon D
  • Do we still need to **rename the DAG**, viz. `dag_id` (after updating the `start_date` / `schedule`) as mentioned [here](https://stackoverflow.com/a/38028555/3679900) or have things improved? – y2k-shubham Jul 05 '18 at 18:18
  • Yes I think you still need to do this. – Simon D Jul 05 '18 at 20:55