I have configured my GKE environment with plenty of resources (including separate node pools), and my DAGs use the KubernetesPodOperator, configured to launch pods only in those node pools:
affinity={
    'nodeAffinity': {
        'requiredDuringSchedulingIgnoredDuringExecution': {
            'nodeSelectorTerms': [{
                'matchExpressions': [{
                    'key': 'cloud.google.com/gke-nodepool',
                    'operator': 'In',
                    'values': [
                        'spawning-pool'
                    ]
                }]
            }]
        }
    }
}
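For completeness, here is roughly how that affinity block is wired into the operator in my DAG file; the DAG id, task id, namespace, image and command below are placeholders, not my actual values:

from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

# Placeholder DAG; only the affinity block mirrors the real configuration.
dag = DAG(
    dag_id='node_pool_example',
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,
)

run_in_pool = KubernetesPodOperator(
    task_id='run_in_spawning_pool',   # hypothetical task id
    name='run-in-spawning-pool',
    namespace='default',
    image='ubuntu:18.04',             # hypothetical image
    cmds=['echo', 'hello'],
    affinity={
        'nodeAffinity': {
            'requiredDuringSchedulingIgnoredDuringExecution': {
                'nodeSelectorTerms': [{
                    'matchExpressions': [{
                        'key': 'cloud.google.com/gke-nodepool',
                        'operator': 'In',
                        'values': ['spawning-pool'],
                    }]
                }]
            }
        }
    },
    dag=dag,
)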
I have also modified my airflow.cfg to raise the concurrency limits for the various (confusingly named) Airflow config parameters:
parallelism = 100
dag_concurrency = 100
max_active_runs_per_dag = 100
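As a sanity check (a quick sketch, not something from my actual DAGs), these settings can be read back from Airflow's own configuration, for example inside a small PythonOperator task or a Python session on a worker, to see which values the running environment actually uses:

from airflow.configuration import conf

# Print the concurrency-related settings Airflow sees at runtime.
# In Airflow 1.10.x all three keys live in the [core] section.
for key in ('parallelism', 'dag_concurrency', 'max_active_runs_per_dag'):
    print(key, '=', conf.getint('core', key))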
However, many of my active DAGs have tasks stuck in the 'Queued' state that never start.
Do I have to restart Composer for the airflow.cfg changes to take effect, or is there something else I am missing?
EDIT:
Just a thought, but maybe this will give some ideas for resolving the issue.
I have been modifying my dag.py files while tasks are running (i.e. I have CI/CD flowing into Composer's GCS dags bucket).
Could it be that a DAG is only re-parsed once no tasks are running for it?
If so, the updated DAG code that points the operator at the new node pool would not be picked up while tasks from the old version are still running.
Composer and Airflow versions:
The Composer console in GCP shows:
Image version: composer-1.7.2-airflow-1.10.2