41

I'm trying to import a local module (a Python script) into my DAG.

Directory structure:

airflow/
├── dag
│   ├── __init__.py
│   └── my_DAG.py
└── script
    └── subfolder
        ├── __init__.py
        └── local_module.py

Sample code in my_DAG.py:

# Trying to import from the local module
from script.subfolder import local_module

# Calling a function defined in local_module.py
a = local_module.some_function()

I get an error in Airflow saying "Broken DAG: my_DAG. No module named 'local_module'".

I've updated Airflow to 1.9.0 but this doesn't fix the issue.

  • What is the solution here?
  • I also read somewhere that I could solve this by creating a plugin. Can anyone point to how I can do this?

Thanks.

Oliver W.
hotchocolate

3 Answers

10

This usually has to do with how Airflow is configured.

In airflow.cfg, make sure airflow_home is set to the directory that holds your Airflow directory structure.

Airflow then scans the subfolders of that directory so that the modules in them can be found.

Otherwise, just make sure the folder you are trying to import from is on the Python path (see "How to use PYTHONPATH").
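To check whether a folder really is on the Python path, it can help to ask Python directly from inside the DAG file. The snippet below is a minimal sketch, not part of the original answer: `is_importable` is a hypothetical helper name, and the dotted name `script.subfolder.local_module` is taken from the question's layout.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if Python can locate `module_name` on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name cannot be found.
        return False


# Show the folders Python searches, in order.
print("\n".join(sys.path))

# Prints False as long as the folder containing "script/" is not on sys.path.
print(is_importable("script.subfolder.local_module"))
```

If this prints False, the parent folder of `script/` is missing from the list printed above, and adding it via PYTHONPATH should make the import resolve.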

tobi6
  • My `airflow.cfg` _has_ no `airflow_home`. Has this changed since this answer was posted? – LondonRob Feb 04 '20 at 15:34
  • You can also achieve this by setting `export AIRFLOW_HOME=path_to_my_airflow_home_dir`. – J.J. May 12 '20 at 05:06
  • would appreciate links to docs, especially since these answers can get stale as the versions go up – Joey Baruch Jul 21 '21 at 13:39
  • https://airflow.apache.org/docs/apache-airflow/stable/modules_management.html#additional-modules-in-airflow according to the link Airflow only adds "dag", "plugins" and "config" to "sys.path", "airflow_home" will not be in the path – shlomiLan Aug 13 '21 at 06:48
4

The way I do it is as follows:

  1. Create a Python script with a main() function in your sub-folder.
  2. In your DAG file, add the sub-folder to sys.path and import the script.

Now you can use this script in your PythonOperator:

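import sys
# Make the sub-folder that holds the script importable
sys.path.insert(0, "/root/airflow/dags/subfolder")
# script_name.py lives in the sub-folder added above
import script_name as script
...
t1 = PythonOperator(
    task_id='python_script',
    python_callable=script.main,  # pass the function itself, do not call it
    dag=dag
)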
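For step 1, the script itself can be as small as the sketch below. The file name `script_name.py` comes from the import above; the body of `main()` is only a placeholder and is an assumption, not the answerer's actual code.

```python
# Hypothetical contents of script_name.py in the sub-folder.


def main():
    """Entry point that the PythonOperator calls via python_callable=script.main."""
    message = "script_name.main() ran"
    print(message)  # shows up in the Airflow task log
    return message


# Lets the script also be run directly, outside Airflow, for a quick check.
if __name__ == "__main__":
    main()
```

Keeping the work inside `main()` rather than at module level means nothing runs when Airflow merely imports the file while parsing the DAG.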
gogaz
Shaby
4

If you run Airflow in Docker, you need to do it as follows:

  1. Create a folder for your modules inside the dags folder, for example programs.
  2. Use it as follows (this is the correct path for Docker):
import sys
sys.path.append('/opt/airflow/dags/programs/my_module')
import my_module
task1 = PythonOperator(
    task_id='my_task_name',
    python_callable=my_module.my_func,
    dag=dag,
)
RafaelJan
  • I tested this code and it works for me with one change: I replaced " sys.path.append('/opt/airflow/dags/programs/my_module') " with " sys.path.append('/opt/airflow/dags/programs/') ". It seems there is no need to add the Python filename at the end of the path. – Ahmad Karrabi Feb 27 '22 at 12:37
  • Not very nice, but it works. Instead of a hard-coded path I use: `sys.path.append(os.path.dirname(os.path.abspath(__file__)))` – KIC Mar 01 '22 at 16:55
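The `__file__`-based variant from the last comment can be wrapped in a small helper so the DAG does not depend on a hard-coded container path. This is a minimal sketch under stated assumptions: `add_dag_dir_to_path` is a hypothetical name, and the temporary folder with `my_module.py` only stands in for a real dags folder so the example can run on its own.

```python
import importlib
import os
import sys
import tempfile


def add_dag_dir_to_path(dag_file):
    """Append the directory containing `dag_file` to sys.path (only once)."""
    dag_dir = os.path.dirname(os.path.abspath(dag_file))
    if dag_dir not in sys.path:
        sys.path.append(dag_dir)
    return dag_dir


# Demo layout (assumption): <tmp>/my_dag.py sits next to <tmp>/my_module.py.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "my_module.py"), "w") as fh:
    fh.write("def my_func():\n    return 'my_func ran'\n")

# In a real DAG file this call would be add_dag_dir_to_path(__file__).
dag_dir = add_dag_dir_to_path(os.path.join(tmp, "my_dag.py"))
importlib.invalidate_caches()  # the module file was created after startup

my_module = importlib.import_module("my_module")
outcome = my_module.my_func()
print(outcome)
```

Because the folder is derived from the DAG file's own location, the same DAG should work whether the dags folder is mounted at /opt/airflow/dags or somewhere else.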