10

I have a DAG which checks at a regular interval for new workflows to be generated (dynamic DAGs) and, if any are found, creates them. (Ref: Dynamic dags not getting added by scheduler)

The above DAG is working, and the dynamic DAGs are getting created and listed in the webserver. Two issues here:

  1. When clicking on the DAG in the web UI, it says "DAG seems to be missing"
  2. The listed DAGs are not shown by the "airflow list_dags" command

Error:

DAG "app01_user" seems to be missing.

The same happens for all the other dynamically generated DAGs. I have compiled the Python script and found no errors.

Edit 1: I tried clearing all data and running "airflow run". It ran successfully, but no dynamically generated DAGs were added by "airflow list_dags". However, running the "airflow list_dags" command itself loaded and executed the base DAG (which generates the dynamic DAGs), and the dynamic DAGs were then listed as below:

[root@cmnode dags]# airflow list_dags
sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8\nLANG=en_US.UTF-8)
[2019-08-13 00:34:31,692] {settings.py:182} INFO - settings.configure_orm(): Using pool settings. pool_size=15, pool_recycle=1800, pid=25386
[2019-08-13 00:34:31,877] {__init__.py:51} INFO - Using executor LocalExecutor
[2019-08-13 00:34:32,113] {__init__.py:305} INFO - Filling up the DagBag from /root/airflow/dags

/usr/lib/python2.7/site-packages/airflow/operators/bash_operator.py:70: PendingDeprecationWarning: Invalid arguments were passed to BashOperator (task_id: tst_dyn_dag). Support for passing such arguments will be dropped in Airflow 2.0. Invalid arguments were:
*args: ()
**kwargs: {'provide_context': True}
  super(BashOperator, self).__init__(*args, **kwargs)
-------------------------------------------------------------------
DAGS
-------------------------------------------------------------------
app01_user
app02_user
app03_user
app04_user
testDynDags

Upon running again, the four generated DAGs above disappeared and only the base DAG, "testDynDags", was displayed.
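For reference, the generator follows the usual module-level registration pattern. A minimal self-contained sketch of that pattern (a stub DAG class stands in for airflow.models.DAG, and the dag_ids are illustrative):

```python
# Minimal stand-in for airflow.models.DAG; a real generator would use
# `from airflow import DAG` instead (names below are illustrative).
class DAG:
    def __init__(self, dag_id):
        self.dag_id = dag_id

# Airflow discovers DAGs by scanning a module's top-level namespace
# for DAG instances, so each generated DAG must be bound to a unique
# module-level name, typically via globals().
for dag_id in ["app01_user", "app02_user", "app03_user", "app04_user"]:
    globals()[dag_id] = DAG(dag_id=dag_id)

discovered = sorted(v.dag_id for v in list(globals().values())
                    if isinstance(v, DAG))
print(discovered)  # the four generated dag_ids
```

If the registrations are not re-created on every parse of the file, the DAGs vanish from the next listing, which is consistent with the disappearing behaviour above.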

Saideep

6 Answers

12

When I was getting this error, there was an exception showing up in the webserver logs. Once I resolved that error and restarted the webserver, it went through normally.

From what I can see, this is the error thrown when the webserver tries to parse the DAG file and hits an error. In my case it was an error importing a new operator I had added to a plugin.

Don
  • I checked the logs too; there was no issue. Later on, I changed the way I create DAGs: I am using a template now. – Saideep Sep 20 '19 at 04:09
  • No problem, came across this when looking at my issue and thought I'd at least write down what I figured out for the next person who ends up here. – Don Sep 20 '19 at 14:44
  • 3
    @Don Thank you, I was the next person :) – Petro Oct 23 '19 at 12:23
  • I'm getting the same error, but none of the advice here helps. The DAG is dead simple, having only one PythonOperator, and loads without difficulty locally. No errors appear in the logs. – Throw Away Account Oct 01 '20 at 13:49
  • restarting the `webserver` helped for me, but then we don't have any understanding of the error then. If restarting can help, that means there is no error in the dag code – cryanbhu Oct 12 '20 at 02:38
  • @cryanbhu how did you restart webserver? – Sana Oct 14 '20 at 13:17
  • @Saba I have Airflow running in Kubernetes, as an Airflow Deployment, so I restarted Airflow by deleting the Airflow Pod which recreates the Pod containing `airflow scheduler` and `airflow webserver` containers. If you're not running Airflow in k8s then just restart the `airflow webserver` running process – cryanbhu Oct 15 '20 at 14:13
2

I found that airflow fails to recognize a dag defined in a file that does not have from airflow import DAG in it, even if DAG is not explicitly used in that file.

For example, suppose you have two files, a.py and b.py:

# a.py

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator


def makedag(dag_id="a"):
    with DAG(dag_id=dag_id) as dag:
        DummyOperator(task_id="nada")


dag = makedag()

and

# b.py

from a import makedag

dag = makedag(dag_id="b")

Then Airflow will only look at a.py. It won't look at b.py at all, not even to notice a syntax error in it! But if you add from airflow import DAG to b.py and change nothing else, it will show up.
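This matches Airflow's file-level discovery heuristic: in "safe mode", the DagBag scans a file's raw text for the strings "DAG" and "airflow" before importing it, and skips files that lack either. A rough self-contained sketch of that pre-filter (simplified; the real check lives inside DagBag.process_file):

```python
# Simplified sketch of Airflow's "safe mode" pre-filter: a file is only
# imported as a DAG file if its source text contains both "DAG" and
# "airflow" as substrings.
def looks_like_dag_file(source: str) -> bool:
    return all(token in source for token in ("DAG", "airflow"))

a_py = "from airflow import DAG\n"           # contains both tokens
b_py = "from a import makedag\ndag = makedag(dag_id='b')\n"

print(looks_like_dag_file(a_py))  # True
print(looks_like_dag_file(b_py))  # False -> b.py is skipped entirely
```

Because b.py contains neither token, it is never imported, which is why even its syntax errors go unnoticed.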

kojiro
1

Usually I check the Airflow UI, where the reason for a broken DAG sometimes appears. If it is not there, I run the .py file of my DAG directly, and the error (the reason the DAG can't be parsed) will appear.
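Running the file directly does surface the hidden parse error. A hedged sketch of the same idea in pure Python (the helper name try_parse and the failing import are illustrative, not part of Airflow):

```python
import importlib.util
import pathlib
import tempfile

def try_parse(path):
    """Import a DAG file the way `python my_dag.py` would, and return
    the exception (if any) that the web UI hides behind the generic
    "DAG seems to be missing" message."""
    spec = importlib.util.spec_from_file_location(
        pathlib.Path(path).stem, path)
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
        return None
    except Exception as exc:
        return exc

# Demo: a DAG file with a broken import fails loudly here.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("import no_such_operator_module\n")
error = try_parse(f.name)
print(type(error).__name__)  # ModuleNotFoundError
```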

0

I never got to work on dynamic DAG generation, but I did face this issue when the DAG was not present on all nodes (scheduler, worker and webserver). If you have an Airflow cluster, please make sure the DAG file is present on all nodes.

Relic16
0

Same error; the reason was that I renamed my dag_id to uppercase, something like "import_myclientname" into "import_MYCLIENTNAME".

DavidBu
0

I am a little late to the party, but I faced this error today:

In short: try executing `airflow dags report` and/or `airflow dags reserialize`.

Check out my comment here: https://stackoverflow.com/a/73880927/4437153

rudald