0

We have a Python project structured as follows; airflow is a new addition:

├── python
│   ├── airflow
│   │   ├── airflow.cfg
│   │   ├── config
│   │   ├── dags
│   │   ├── logs
│   │   ├── requirements.txt
│   │   └── webserver_config.py
│   ├── shared_utils
│   │   ├── auth
│   │   ├── datadog
│   │   ├── drivers
│   │   ├── entities
│   │   ├── formatter
│   │   ├── helpers
│   │   └── system
...

We have several other packages at the same level as shared_utils; some are common libraries and some are standalone backend services.

We want to keep the airflow part independent while still benefiting from the common libraries. The python folder is in PYTHONPATH, and python/airflow is in PYTHONPATH as well (currently airflow doesn't import any code from the other packages).

How can I call code from shared_utils in my airflow DAGs, or how should I organize the project structure to make that possible?

UPDATE:

It seems there is no conflict when I put both python and python/airflow in PYTHONPATH. After adding the requirements from shared_utils to airflow's requirements, it works as expected.
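One possible reason for the lack of conflict, assuming python/airflow has no `__init__.py` (none is listed in the tree above): Python prefers a regular package found anywhere on the path over a plain folder of the same name, so the installed airflow package would still be the one imported. A minimal sketch of that behaviour, using a hypothetical package name `pkg_sketch` and temporary folders in place of the real project and site-packages:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def resolve_demo() -> str:
    """Return which copy of the hypothetical package 'pkg_sketch' gets imported."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "python"      # stands in for the project's python folder
        site = Path(tmp) / "site-packages"  # stands in for installed packages

        # Project side: a plain folder without __init__.py (like python/airflow).
        (project / "pkg_sketch").mkdir(parents=True)

        # Installed side: a regular package with __init__.py.
        (site / "pkg_sketch").mkdir(parents=True)
        (site / "pkg_sketch" / "__init__.py").write_text("SOURCE = 'installed'\n")

        # Project folder goes first, as PYTHONPATH entries precede site-packages.
        sys.path[:0] = [str(project), str(site)]
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("pkg_sketch")
            return module.SOURCE
        finally:
            # Undo the changes so the demo leaves no trace behind.
            sys.path.remove(str(project))
            sys.path.remove(str(site))
            sys.modules.pop("pkg_sketch", None)


if __name__ == "__main__":
    print(resolve_demo())
```

If python/airflow ever gains an `__init__.py`, this no longer holds and the local folder could shadow the installed package, so it may be worth keeping that folder free of one.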

seaguest
  • How to import functions from other projects in Python? https://stackoverflow.com/questions/14509192/how-to-import-functions-from-other-projects-in-python – Ronin Jan 03 '23 at 19:12
  • How to share code between python projects? https://stackoverflow.com/questions/48954870/how-to-share-code-between-python-projects – Ronin Jan 03 '23 at 19:13
  • @Ronin please see my update, python and python/airflow are both in PYTHONPATH, maybe I should move airflow out of python. – seaguest Jan 04 '23 at 08:06
  • isn't that the answer? https://stackoverflow.com/a/48964114/2316519 – Ronin Jan 04 '23 at 08:32
  • check links in this answer ... – Ronin Jan 04 '23 at 08:38

2 Answers

0

I have a project with this layout:

|
|-- dags/
|---- dag.py
|-- logs/
|-- plugins/
|---- __init__.py
|---- core.py
|-- airflow.cfg

I keep the core code in core.py.

To use the code from core.py in dag.py, I do the following:

from core import <some function>

Note:

This is my airflow.cfg file; it registers the plugins folder so that PythonVirtualenvOperator can find the code in plugins.

[core]
dags_folder = {AIRFLOW_HOME}/dags
plugins_folder = {AIRFLOW_HOME}/plugins

TL;DR

For your case, I imagine you could do something like this in airflow.cfg:

plugins_folder = {AIRFLOW_HOME}/shared_utils
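One thing to keep in mind with this setting: as far as I know, Airflow puts the plugins folder itself on `sys.path`, so its subpackages become importable by their own names rather than through the `shared_utils` prefix. Here is a rough sketch of that effect without Airflow involved; the folder and package names (`shared_utils_sketch`, `auth_sketch`) are made up for the demo:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def can_import(name: str) -> bool:
    """Return True if `name` can be imported with the current sys.path."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def check_folder_on_path() -> tuple:
    """Put a folder on sys.path and test both import styles for its subpackage."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp) / "shared_utils_sketch"  # hypothetical stand-in folder
        auth = folder / "auth_sketch"
        auth.mkdir(parents=True)
        (auth / "__init__.py").write_text("TOKEN = 'demo'\n")

        # Roughly what registering the folder as plugins_folder amounts to.
        sys.path.insert(0, str(folder))
        importlib.invalidate_caches()
        try:
            direct = can_import("auth_sketch")
            via_parent = can_import("shared_utils_sketch.auth_sketch")
        finally:
            sys.path.remove(str(folder))
            sys.modules.pop("auth_sketch", None)
        return direct, via_parent


if __name__ == "__main__":
    print(check_folder_on_path())
```

So with this approach a DAG would write `from auth import ...` instead of `from shared_utils.auth import ...`, unless the parent folder is also on the path.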
0

You can move shared_utils into a new folder, my_package, inside the python folder, then add the my_package path to your PYTHONPATH:

# on your host (single quotes keep $PYTHONPATH unexpanded until ~/.profile is sourced)
echo 'export PYTHONPATH="/path/to/python/my_package:$PYTHONPATH"' >> ~/.profile
# in the Dockerfile of your airflow image
ENV PYTHONPATH="/path/to/python/my_package"

Now you can import from your package in any Python session:

from shared_utils.auth import module_x
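To check that the path change took effect before wiring it into a DAG, a small diagnostic along these lines may help. It is only a sketch: `locate_module` is a helper name I made up, and it relies on nothing beyond the standard library's `importlib.util.find_spec`:

```python
import importlib.util
from typing import Optional


def locate_module(name: str) -> Optional[str]:
    """Return where `name` would be imported from, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised for dotted names whose parent package cannot be imported.
        return None
    if spec is None:
        return None
    if spec.origin is not None:
        return spec.origin
    # Folders without __init__.py have no origin, only search locations.
    locations = list(spec.submodule_search_locations or [])
    return locations[0] if locations else None


if __name__ == "__main__":
    # Run this with the same environment the scheduler and workers use.
    print(locate_module("shared_utils"))
```

If it prints `None`, the folder containing shared_utils is not on the path of the process you ran it in.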
Hussein Awala