I have 3 container images that run my workload.
(each of them expects its input files to be present in its own file system)
- Container 1 generates file_1
- Container 2 consumes file_1 and generates file_2
- Container 3 consumes file_1 and file_2 and generates file_3
So the Airflow tasks would be:
container 1 >> container 2 >> container 3
I want to use the KubernetesPodOperator in Airflow to take advantage of the auto-scaling options of Airflow running on Kubernetes. But since the KubernetesPodOperator creates one pod per task, and each of these steps is its own task, how can I pass these files around?
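To make this concrete, here is a rough sketch of the DAG I have in mind (the image names, namespace, and the provider import path are placeholders/assumptions and may differ depending on the cncf-kubernetes provider version):

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
    KubernetesPodOperator,
)

# Placeholder image names; each step expects its inputs on its own
# local file system, which is the crux of the problem since every
# task runs in a separate pod.
with DAG(
    dag_id="three_step_pipeline",
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:

    container_1 = KubernetesPodOperator(
        task_id="container_1",
        name="container-1",
        namespace="default",
        image="my-registry/container-1:latest",  # writes file_1
        get_logs=True,
    )

    container_2 = KubernetesPodOperator(
        task_id="container_2",
        name="container-2",
        namespace="default",
        image="my-registry/container-2:latest",  # needs file_1, writes file_2
        get_logs=True,
    )

    container_3 = KubernetesPodOperator(
        task_id="container_3",
        name="container-3",
        namespace="default",
        image="my-registry/container-3:latest",  # needs file_1 and file_2, writes file_3
        get_logs=True,
    )

    container_1 >> container_2 >> container_3
```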
I could modify the source code in each container to be aware of an intermediate location like S3 and upload/download the files there, but is there a built-in Airflow way of doing this without modifying the worker images?