40

I am new to Airflow. I am following a tutorial and have written the following code:

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta
from models.correctness_prediction import CorrectnessPrediction

default_args = {
    'owner': 'abc',
    'depends_on_past': False,
    'start_date': datetime.now(),
    'email': ['abc@xyz.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}

def correctness_prediction(arg):
    CorrectnessPrediction.train()

dag = DAG('daily_processing', default_args=default_args)

task_1 = PythonOperator(
    task_id='print_the_context',
    provide_context=True,
    python_callable=correctness_prediction,
    dag=dag)

On running the script, it doesn't show any errors, but when I check for DAGs in the web UI, nothing shows up under Menu -> DAGs.

But I can see the scheduled job under Menu->Browse->Jobs

I also cannot see anything in $AIRFLOW_HOME/dags. Is it supposed to be like this? Can someone explain why?

Rusty
  • Create a subdirectory called `dags` in your main project directory and move your DAG there. Then refresh the Airflow UI and you should be able to see it. Note that the `AIRFLOW_HOME` should be set to be your main project directory. – tsveti_iko May 31 '22 at 11:23
  • You should not use datetime.now() to schedule – Phd. Burak Öztürk Nov 05 '22 at 08:29

15 Answers

31

Run airflow dags list (or airflow list_dags for Airflow 1.x) to check whether the DAG file is located correctly.

For some reason, I didn't see my DAG in the browser UI before I executed this. It must be an issue with the browser cache or something.

If that doesn't work, you should just restart the webserver with airflow webserver -p 8080 -D

samu
  • Do you know how to fix the browser UI problem? – Eric Bellet Aug 27 '19 at 12:48
  • @EricBellet for me `airflow list_dags` helped as quick fix, I don't know the root cause for this – samu Aug 27 '19 at 13:29
  • 3
    Yes. Restarting the UI with airflow webserver -p 8080 -D is another quick fix – Eric Bellet Aug 27 '19 at 13:57
  • 3
    Sometimes even this takes a while to work. I had an experience just now where I followed all of the instructions in this answer, but it still took about 3 minutes for the new DAG to show up in the UI. At some point maybe I'll dig into the configuration settings to see if this is a refresh frequency that can be tweaked. – Stephen Jan 07 '20 at 19:50
  • I had a DAG that was throwing an error, but rather than the error propagating to the UI, the DAG just wouldn't show up. Running `airflow list_dags` allowed me to see the error and debug that way. I am using an older version of Airflow. – ChristopherTull Oct 26 '20 at 18:10
  • 16
    For Airflow 2, try `airflow dags list` – Requin Jun 10 '21 at 19:50
24

I had the same issue. To resolve it, I needed to run the scheduler:

airflow scheduler

Without this command, I don't see my new DAGs. BTW, the UI shows me a warning related to that problem:

The scheduler does not appear to be running. Last heartbeat was received 9 seconds ago. The DAGs list may not update, and new tasks will not be scheduled.

DenisOgr
23

We need to clarify several things:

  1. By no means do you need to run the DAG file yourself (unless you're testing it for syntax errors). That is the job of the Scheduler/Executor.
  2. For the DAG file to be visible to the Scheduler (and, consequently, the Webserver), you need to add it to dags_folder (specified in airflow.cfg; by default it's the $AIRFLOW_HOME/dags subfolder).

The Airflow Scheduler checks dags_folder for new DAG files every 5 minutes by default (governed by dag_dir_list_interval in airflow.cfg; the relevant settings are shown after the list below). So if you just added a new file, you have two options:

  1. Restart the Scheduler, or
  2. Wait until the current Scheduler process picks up the new DAGs.
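
For reference, a sketch of the relevant airflow.cfg entries (default values; the path shown is only an illustration and will differ on your machine):

[core]
# folder that the Scheduler and Webserver scan for DAG files
dags_folder = /home/<user>/airflow/dags

[scheduler]
# how often (in seconds) the Scheduler re-scans the DAGs folder for new files
dag_dir_list_interval = 300
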
ptyshevs
  • 1
    Ah for me, that was it -- i didn't have the scheduler running to pick up new dags. thanks! – Doug F May 05 '20 at 02:06
13

The SchedulerJob that you see on the Jobs page is an entry for the Scheduler. That's not the DAG being scheduled.

It's weird that your $AIRFLOW_HOME/dags is empty. All DAGs must live within the $AIRFLOW_HOME/dags directory (specifically, in the dags directory configured in your airflow.cfg file). It looks like you are not storing the actual DAG in the right directory (the dags directory).

Alternatively, sometimes you also need to restart the webserver for the DAG to show up (though that doesn't seem to be the issue here).
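
To make that concrete, here is a minimal sketch of a single-file DAG saved as $AIRFLOW_HOME/dags/daily_processing.py (the callable, schedule and start date below are placeholders, not taken from the question):

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def train_model():
    print("training would run here")

# the DAG object has to be created at module (global) scope so that the
# scheduler can find it when it imports this file from the dags folder
dag = DAG(
    'daily_processing',
    start_date=datetime(2016, 8, 1),
    schedule_interval=timedelta(days=1))

task_1 = PythonOperator(
    task_id='train_model',
    python_callable=train_model,
    dag=dag)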

Vineet Goel
  • 1
    Do I need to run the script _mentioned in the question_ in $AIRFLOW_HOME/dags folder ? – Rusty Aug 17 '16 at 20:08
  • Yes, that's right. All your DAG definitions (the Python files that initialize DAGs, i.e. the line `dag = DAG(...)` in your example above) should be at global scope within the DAGs dir configured in your airflow.cfg file. – Vineet Goel Aug 17 '16 at 22:40
8

Check the dags_folder variable in airflow.cfg. If you have a virtual environment, run export AIRFLOW_HOME=$(pwd) from the main project directory. Note that export AIRFLOW_HOME=$(pwd) expects your DAGs to be in a dags subdirectory of the project directory.
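
If you're not sure which folder Airflow actually resolved, a quick sanity check is to print it from the same environment and AIRFLOW_HOME as your webserver/scheduler (a small sketch using Airflow's own configuration API):

from airflow.configuration import conf

# prints the dags_folder value Airflow resolved from airflow.cfg / AIRFLOW_HOME
print(conf.get('core', 'dags_folder'))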

deerishi
  • Doing `export AIRFLOW_HOME=/absolute/path/to/airflow` and then `airflow dags list-import-errors` showed me a python syntax error that was keeping my dag from showing up in the list, on airflow v2.4.0 – Nicholas Hansen-Feruch Oct 04 '22 at 01:21
2

I just ran into the same problem. Airflow suggested that I use the following command to evaluate my DAG:

Error: Failed to load all files. For details, run `airflow dags list-import-errors`

It was just a comma in my way :).

tp.du
1

I had the same issue. I had installed Airflow twice, once without sudo and once with sudo. I was using the sudo version, while the directories were under my user path. I simply ran: export AIRFLOW_HOME=~/airflow

Jonathan
1

In my case, the DAG was exactly one of the default examples that I had copy-pasted to check the volume mappings in my docker-compose installation. It turns out that while the web UI showed no errors, the command line airflow dags list returned the error

Error: Failed to load all files. For details, run airflow dags list-import-errors.

Which is the key to the solution:

  • the DAG was not added since it was a duplicate of an already loaded dag
Anze
1

Airflow uses a heuristic to pre-check whether a Python file contains a graph definition: it looks for the presence of the strings DAG and airflow in the file, and if the file does not contain both of them, Airflow ignores it. This is documented as a note in the Core Concepts / DAGs / Loading DAGs section of the documentation.

The check has been case-insensitive since Airflow 2. The behavior can be turned off entirely with the dag_discovery_safe_mode configuration option (available since Airflow 1.10.3).
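
As an illustration of how this bites in practice (the module and helper names below are made up): a file that builds its DAGs only through an imported factory may contain neither of the two strings, so safe mode skips it; a marker comment is enough to make it discoverable again.

# hypothetical module placed in the scheduled scan directory: the graph is
# produced entirely by a factory import, so apart from the marker line below
# the two strings the heuristic scans for never occur in this source file
from my_project.pipeline_factory import build_pipeline  # hypothetical helper

daily_pipeline = build_pipeline('daily_pipeline')

# marker so safe mode does not skip this file: airflow DAG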

maciek
0

Check the paused DAGs; your DAG might have ended up there. If you are sure that you have added the .py file correctly, then manually type the URL of the DAG using its dag_id, e.g. http://AIRFLOW_URL/graph?dag_id=dag_id. Then you can see whether Airflow has accepted your DAG or not.

Nikhil Redij
0

I experienced the same issue. In my case, the permissions of the new DAG were incorrect.

Run ls -l to see the permissions of the new DAG file. For me, the owner was listed as myself instead of the default airflow user (which in my case should have been root).

Once I changed the ownership (chown root:root <file_name>), the file showed up in the web UI immediately.

0

Listing the DAGs or restarting the webserver didn't help me, but resetting the DB did:

airflow db reset
0

After reading the previous answers, this is what worked for me:

  1. Restart the webserver, e.g. pkill -f "airflow webserver" and then airflow webserver -D.
  2. Also restart the scheduler with pkill -f "airflow scheduler" and airflow scheduler -D.

Besides that, make sure that your DAG is contained in the DAGs folder specified in airflow.cfg, which is located in $AIRFLOW_HOME.

This worked for me in a case where I could see the DAG with airflow dags list, but couldn't see it in the UI or trigger it.

PandaPhi
0

I had the same problem using WSL on Windows 10, so I had to shut down the scheduler and the webserver; when I ran them again, it worked fine.

NOTE: It seems that each time you change the DAGs path in airflow.cfg, you have to restart the server.

0

There is an easier way than those described above.

DAGs are stored in the database, and at the same time information about them is cached in the client (the browser).

You don't need to reboot your server or your Airflow containers. You need to do a "cache flush and hard reload" of the browser page that lists the Airflow DAGs. For Chrome it is:

F12 -> right-click the reload icon -> "Empty cache and hard reload"

NIT: I'll improve my answer when I find a variable like "cache lifetime" or a less hacky way to do it

MaC'kRage