
I have a DAG that inserts data into a SQL Server database. Some of the tasks take 24+ hours to run because the database it's inserting into does not perform well.

I need the tasks to be marked as complete automatically if they take more than 24 hours to run, so that I can move on and start inserting the next day's worth of data (the DAG runs daily, and the data source receives new data every day). How can I do this programmatically, without having to go into the UI to mark a task as 'Success' or 'Failed'?

cluis92
  • Which version of Airflow are you using? – drum Jul 12 '21 at 03:33
  • @cluis92 you can run a SQL query directly against Airflow's metadata DB: https://stackoverflow.com/questions/40315171/airflow-mark-a-specific-task-id-of-given-dag-id-and-run-id-as-success-or-failu – lionbigcat Jul 12 '21 at 03:37

1 Answer


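You could follow an approach similar to the one in this Stack Overflow post: "kill or terminate subprocess when timeout". Run the insert in a subprocess with a timeout, and once the timeout occurs, just make sure the task does not raise an exception. A task that returns normally is recorded by Airflow as successful.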
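As a rough illustration of that idea, here is a minimal sketch that uses only Python's standard `subprocess` module. The helper `run_with_deadline`, the callable `insert_daily_data`, and the script name `insert_job.py` are hypothetical names made up for this example; it also assumes the insert logic can be launched as a separate process.

```python
import subprocess
import sys


def run_with_deadline(cmd, timeout_seconds):
    """Run cmd, but give up once timeout_seconds have elapsed.

    Returns True if the command finished in time, False if it was cut off.
    A non-zero exit code still raises, so genuine failures are not hidden.
    """
    try:
        # subprocess.run kills the child process when the timeout expires
        # and then raises TimeoutExpired.
        subprocess.run(cmd, timeout=timeout_seconds, check=True)
        return True
    except subprocess.TimeoutExpired:
        # Swallow the timeout on purpose: returning normally is what lets
        # the calling task finish without an error.
        return False


def insert_daily_data():
    # Hypothetical task callable; "insert_job.py" is a placeholder script.
    finished = run_with_deadline([sys.executable, "insert_job.py"], 24 * 60 * 60)
    if not finished:
        print("Insert exceeded 24 hours; stopping and moving on.")
```

A function like `insert_daily_data` could then be passed as the `python_callable` of a `PythonOperator`. Note that the subprocess is killed at the deadline, so whether a partially completed insert is acceptable depends on how the job handles transactions.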

Jorrick Sleijster
  • I don't see how this addresses the OP's question, which is how to mark a task in Airflow as successful programmatically. – lionbigcat Jul 13 '21 at 09:19