I'm currently working on a data engineering project in which one EC2 instance hosts Airflow and another EC2 instance acts as both the Spark master and a Spark worker. Submitting jobs to Spark succeeds when it is done directly from the host machine of the client node, but it fails when done from inside a Docker container (I want to dockerize Airflow). The evidence: if I open a spark-shell on the host machine of the client node, it works perfectly; if I open it inside the Docker container, it fails with the error:
Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
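For reference, this is roughly how I open the shell in both cases (the master hostname and port below are placeholders, not my real values):

# Works when run on the EC2 host of the client node,
# fails with the error above when run inside the Airflow container.
# spark://<spark-master-host>:7077 is a placeholder for my real standalone master URL.
spark-shell --master spark://<spark-master-host>:7077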
I've included the docker-compose file below:
version: '3.3'

x-airflow-common:
  &airflow-common
  build:
    context: ./Docker_Container_Airflow
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: LocalExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW_CONN_SPARK_CON: $Spark_Con
    AIRFLOW_CONN_MYSQL_CON_CITY: $MySQL_Con_City
    AIRFLOW_CONN_MYSQL_CON_COUNTRY: $MySQL_Con_Country
    AIRFLOW_CONN_MYSQL_CON_GLOBAL: $MySQL_Con_Global
    AIRFLOW_CONN_S3_CON: $S3_Con
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
  user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}"
  depends_on:
    postgres:
      condition: service_healthy
# The postgres database is used to store the metadata during the operation of the Airflow webserver
services:
  postgres:
    container_name: postgres-airflow
    image: postgres:13
    healthcheck:
      test: [ "CMD", "pg_isready", "-U", "airflow" ]
      interval: 5s
      retries: 5
    restart: always
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow

  scheduler:
    <<: *airflow-common
    container_name: airflow_scheduler
    command: scheduler
    healthcheck:
      test:
        [
          "CMD-SHELL",
          'airflow jobs check --job-type SchedulerJob --hostname "$${HOSTNAME}"'
        ]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always

  webserver:
    <<: *airflow-common
    container_name: airflow_webserver
    command: webserver
    ports:
      - 8080:8080
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_UPGRADE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
      _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
    healthcheck:
      test:
        [
          "CMD",
          "curl",
          "--fail",
          "http://localhost:8080/health"
        ]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always
I've also tried attaching those services to the host network, but then the Docker containers (airflow_webserver and airflow_scheduler) end up being unhealthy.
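To sketch what that attempt looked like (an approximation, not my exact file; with network_mode: host the published ports have to be dropped because the container shares the host's network stack):

  scheduler:
    <<: *airflow-common
    container_name: airflow_scheduler
    command: scheduler
    network_mode: host   # container uses the EC2 host's network stack directly

  webserver:
    <<: *airflow-common
    container_name: airflow_webserver
    command: webserver
    network_mode: host   # "ports:" mapping removed; 8080 is bound directly on the host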
I've searched Google and other platforms, but I can't figure out what exactly causes this. What is the reason, and how can I resolve it?