Questions tagged [airflow-scheduler]

Apache Airflow is a platform to programmatically author, schedule, and monitor workflows. The Airflow scheduler monitors all tasks and all DAGs, and triggers the task instances whose dependencies have been met.

1257 questions
65
votes
15 answers

Airflow 1.9.0 is queuing but not launching tasks

Airflow is randomly not running queued tasks; some tasks don't even get queued status. I keep seeing the following in the scheduler logs: [2018-02-28 02:24:58,780] {jobs.py:1077} INFO - No tasks to consider for execution. I do see tasks in the database that…
l0n3r4n83r
  • 1,271
  • 1
  • 14
  • 25
55
votes
3 answers

Efficient way to deploy dag files on airflow

Are there any best practices for deploying new DAGs to Airflow? I saw a couple of comments on the Google forum stating that the DAGs are saved inside a Git repository, which is synced periodically to the local location in…
Sreenath Kamath
  • 663
  • 1
  • 7
  • 17
35
votes
6 answers

Airflow tasks get stuck at "queued" status and never gets running

I'm using Airflow v1.8.1 and run all components (worker, web, flower, scheduler) on Kubernetes & Docker. I use the Celery Executor with Redis, and my tasks look like: (start) -> (do_work_for_product1) ├ -> (do_work_for_product2) ├ ->…
Norio Akagi
  • 705
  • 1
  • 8
  • 22
33
votes
2 answers

How to define Airflow DAG/task that shouldn't run periodically

The goal is pretty simple: I need to create a DAG for a manual task that should not run periodically, but only when admin presses the "Run" button. Ideally without a need to switch "unpause" and "pause" the DAG (you know someone will surely forget…
Ikar Pohorský
  • 4,617
  • 6
  • 39
  • 56
28
votes
2 answers

Airflow S3KeySensor - How to make it continue running

With the help of this Stack Overflow post I just made a program (the one shown in the post) where, when a file is placed inside an S3 bucket, a task in one of my running DAGs is triggered and then I perform some work using the BashOperator. Once it's…
Kyle Bridenstine
  • 6,055
  • 11
  • 62
  • 100
27
votes
5 answers

Airflow: Creating a DAG in airflow via UI

Airflow veterans, please help. I was looking for a cron replacement and came across Apache Airflow. We have a setup where multiple users should be able to create their own DAGs and schedule their jobs. Our users are a mix of people who may not know…
Mukul Jain
  • 1,807
  • 9
  • 26
  • 38
27
votes
2 answers

Airflow scheduler is slow to schedule subsequent tasks

When I try to run a DAG in Airflow 1.8.0, I find that a lot of time passes between the completion of the predecessor task and the time at which the successor task is picked up for execution (usually greater than the execution times of individual…
Prasann
  • 460
  • 1
  • 5
  • 13
25
votes
3 answers

How to Trigger a DAG on the success of a another DAG in Airflow using Python?

I have a Python DAG Parent Job and a DAG Child Job. The tasks in the Child Job should be triggered on the successful completion of the Parent Job tasks, which are run daily. How can I add an external job trigger? MY CODE from datetime import datetime,…
25
votes
1 answer

Which one to choose Apache Oozie or Apache Airflow? Need a comparison

I am new to job schedulers and was looking for one to run jobs on a big data cluster. I was quite confused by the available choices. I found Oozie to have many limitations compared to existing ones such as TWS, Autosys, etc. Need…
Vishal786btc
  • 428
  • 1
  • 5
  • 17
23
votes
2 answers

Airflow worker stuck : Task is in the 'running' state which is not a valid state for execution. The task must be cleared in order to be run

Airflow tasks run without any issues, then suddenly get stuck halfway through, and the task instance details show the above message. I cleared my entire database, but I am still getting the same error. I am getting this issue for only some DAGs.…
joss
  • 695
  • 1
  • 5
  • 16
22
votes
3 answers

Airflow dynamic tasks at runtime

Other questions about 'dynamic tasks' seem to address dynamic construction of a DAG at schedule or design time. I'm interested in dynamically adding tasks to a DAG during execution. from airflow import DAG from airflow.operators.dummy_operator…
Kirk Broadhurst
  • 27,836
  • 16
  • 104
  • 169
21
votes
2 answers

Airflow webserver gives cron error for dags with None as schedule interval

I'm running Airflow 1.9.0 with LocalExecutor and PostgreSQL database in a Linux AMI. I want to manually trigger DAGs, but whenever I create a DAG that has schedule_interval set to None or to @once, the webserver tree view crashes with the following…
T. van Hees
  • 211
  • 1
  • 2
  • 5
21
votes
4 answers

Remove Airflow Scheduler logs

I am using Docker Apache airflow VERSION 1.9.0-2 (https://github.com/puckel/docker-airflow). The scheduler produces a significant amount of logs, and the filesystem will quickly run out of space, so I am trying to programmatically delete the…
Ryan Stack
  • 1,231
  • 1
  • 12
  • 25
19
votes
4 answers

Airflow giving log file does not exist error while running on Docker

The scheduler and the webserver are running in different containers, and when I run a DAG and check the logs on the webserver, it shows me this particular error. *** Log file does not exist:…
isht3
  • 333
  • 1
  • 6
  • 11
19
votes
5 answers

How to delete XCOM objects once the DAG finishes its run in Airflow

I have a huge JSON file in XCom that I no longer need once the DAG execution is finished, but I still see the XCom object in the UI with all the data. Is there any way to delete the XCom programmatically once the DAG run is finished? Thank…
vijay krishna
  • 263
  • 1
  • 3
  • 14