
In my first foray into Airflow, I am trying to run one of the example DAGs that comes with the installation. This is v1.8.0. Here are my steps:

$ airflow trigger_dag example_bash_operator
[2017-04-19 15:32:38,391] {__init__.py:57} INFO - Using executor SequentialExecutor
[2017-04-19 15:32:38,676] {models.py:167} INFO - Filling up the DagBag from /Users/gbenison/software/kludge/airflow/dags
[2017-04-19 15:32:38,947] {cli.py:185} INFO - Created <DagRun example_bash_operator @ 2017-04-19 15:32:38: manual__2017-04-19T15:32:38, externally triggered: True>
$ airflow dag_state example_bash_operator '2017-04-19 15:32:38'
[2017-04-19 15:33:12,918] {__init__.py:57} INFO - Using executor SequentialExecutor
[2017-04-19 15:33:13,229] {models.py:167} INFO - Filling up the DagBag from /Users/gbenison/software/kludge/airflow/dags
running

The dag state remains "running" for a long time (at least 20 minutes so far), although a quick inspection of this DAG suggests it should take a matter of seconds. How can I troubleshoot this? How can I see which step it is stuck on?
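(Aside: the same 1.8 CLI also exposes per-task state, which can help narrow down where a run is stuck; the task name below is one of the tasks defined in the example DAG, and the date matches the manual run above:)

$ airflow list_tasks example_bash_operator
$ airflow task_state example_bash_operator runme_0 '2017-04-19 15:32:38'

Per-task logs should also appear under $AIRFLOW_HOME/logs/example_bash_operator/<task_id>/<execution_date>.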

gcbenison
  • Can you share your code? It would be easier to answer if we knew what you are doing – Aravind Krishnakumar Apr 19 '17 at 22:41
  • I haven't added any code beyond what is provided with the v.1.8.0 installation. – gcbenison Apr 19 '17 at 22:54
  • Oh! Check schedule_interval and start_date; if the date and time is scheduled in the future, the script will trigger only at that particular date and time – Aravind Krishnakumar Apr 19 '17 at 23:08
  • I think you are referring to this example DAG: https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_bash_operator.py In that case, there is some issue with the task config, and the dag_run timeout is 60 mins, so it will continue to execute. Go to the logs directory and post the log for this dag run to get further insight – Priyank Mehta Apr 20 '17 at 05:53
  • Did you ever resolve this? – RobGThai Jun 30 '17 at 09:25
  • I never resolved the issue with this example DAG specifically, but I've since gone on to use Airflow for my own purposes, so for me the broader issue of needing to get started with Airflow is resolved. – gcbenison Jul 06 '17 at 02:52

4 Answers


To run any DAGs, you need to make sure two processes are running:

  • airflow webserver
  • airflow scheduler

If you only have airflow webserver running, the UI will show DAGs as running, but if you click on a DAG, none of its tasks are actually running or scheduled; they sit in a null state, waiting to be picked up by airflow scheduler. If airflow scheduler is not running, you'll be stuck in this state forever, as the tasks are never picked up for execution.
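A minimal sketch of bringing both up, assuming default settings (run each in its own terminal, or add -D to daemonize; 8080 is the webserver's default port):

$ airflow webserver --port 8080
$ airflow scheduler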

Additionally, make sure that the toggle button in the DAGs view is switched to 'ON' for the particular DAG. Otherwise, even a manually triggered run will not get picked up by the scheduler.

chhantyal
Ladislav Indra

I too recently started using Airflow and my DAGs kept endlessly running. Your DAG may be set to 'paused' without you realizing it; in that case the scheduler will not schedule new task instances, and when you trigger the DAG it just looks like it is endlessly running.

There are a few solutions:

1) In the Airflow UI, toggle the button to the left of the DAG from 'Off' to 'On'. Off means the DAG is paused, so On will allow the scheduler to pick it up and complete the DAG. (This fixed my initial issue.)

2) In your airflow.cfg file, dags_are_paused_at_creation = True is the default, so all new DAGs you create are paused from the start. Change this to False and future DAGs you create will be good to go right away (see the snippet after this list; I had to restart the webserver and scheduler for changes to airflow.cfg to be recognized).

3) Use the command line: $ airflow unpause [dag_id]. Documentation: https://airflow.apache.org/cli.html#unpause
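For option 2, this is the relevant setting; in a stock airflow.cfg it sits under the [core] section:

[core]
dags_are_paused_at_creation = False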

jbreezybaby

The below worked for me; a shell sketch of the steps follows the list.

  1. Make sure AIRFLOW_HOME is set.
  2. In AIRFLOW_HOME, create the folders dags and plugins. The folders need read, write, and execute permissions for the airflow user.
  3. Make sure you have at least one DAG in the dags/ folder.
  4. pip install celery[redis]==4.1.1
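A sketch of those steps as shell commands, assuming a typical ~/airflow home (the paths are illustrative; run the chmod as, or on behalf of, the airflow user):

$ export AIRFLOW_HOME=~/airflow
$ mkdir -p $AIRFLOW_HOME/dags $AIRFLOW_HOME/plugins
$ chmod -R u+rwx $AIRFLOW_HOME/dags $AIRFLOW_HOME/plugins
$ pip install 'celery[redis]==4.1.1'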

I have checked the above solution on Airflow 1.9.0.

I also tried the same steps with Airflow 1.10 and it worked.

chendu

I faced this issue with a sensor task: it stayed in a running state indefinitely and never moved further. To resolve it, I immediately marked the task as failed, changed the cluster name in the code, and re-triggered the DAG. Alternatively, you can just fail the current task, wait a minute, and re-trigger the DAG.
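If you'd rather reset the stuck task from the CLI instead of the UI, airflow clear can wipe its task instance so the next trigger starts it fresh (note this clears the task rather than marking it failed; the DAG name, task name, and dates here are placeholders):

$ airflow clear my_dag -t my_sensor_task -s 2017-04-19 -e 2017-04-19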

raja