I'm trying to set up an Airflow instance using docker-compose as described in the official docs, and I'm stuck at the airflow-init step. It looks like there is no connectivity between containers, but I don't know how to fix it.
I use literally the same docker-compose.yaml as described in the docs. It can be downloaded here: https://airflow.apache.org/docs/apache-airflow/stable/docker-compose.yaml
Currently, I see this in my shell:
~/dwn $ docker-compose up airflow-init
51ad8448b197_dwn_redis_1 is up-to-date
70409dec742c_dwn_postgres_1 is up-to-date
Starting dwn_airflow-init_1 ... done
Attaching to dwn_airflow-init_1
airflow-init_1 | BACKEND=postgresql+psycopg2
airflow-init_1 | DB_HOST=postgres
airflow-init_1 | DB_PORT=5432
The docs say I should see something like this:
airflow-init_1 | Upgrades done
airflow-init_1 | Admin user airflow created
airflow-init_1 | 2.1.2
start_airflow-init_1 exited with code 0
but that command just hangs and never exits. Htop shows me that netcat is running inside this container and it is trying to connect to postgres:
nc -zvvn 172.19.0.3 5432
curl shows a timeout:
~/dwn $ docker exec -it dwn_airflow-init_1 curl postgres:5432
curl: (7) Failed to connect to postgres port 5432: Connection timed out
Why does it hang?
I tried a few things to fix this:
- I tried setting the ports option in the postgres service to 5432:5432 - no effect
- I tried setting the links option - no effect
- Another question suggested system entropy might be too low - no, there is plenty of entropy
- There is enough free RAM, CPU, and disk space
- I tried setting the network like in this answer - it is even worse, container names aren't resolved:
  ~/dwn $ docker exec -it dwn_airflow-init_1 curl postgres:5432
  curl: (6) Could not resolve host: postgres
- I tried resetting iptables as suggested in this answer - no effect
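The two curl failures above (exit 7, connection timed out, and exit 6, could not resolve host) happen at different stages of a connection. Here is a minimal Python sketch that separates those stages, assuming Python is available inside the container. The function name `probe`, the result labels, and the 3-second timeout are illustrative choices of mine, not part of Airflow or Docker:

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connection attempt as 'dns', 'timeout', 'refused', 'error' or 'ok'.

    Hypothetical helper for debugging; the labels are my own convention.
    """
    try:
        # Stage 1: name resolution (what curl reports as exit code 6).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"

    # Stage 2: TCP connect to the first resolved address.
    family, socktype, proto, _, sockaddr = infos[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
        except socket.timeout:
            # No answer at all, e.g. packets dropped somewhere on the path.
            return "timeout"
        except ConnectionRefusedError:
            # Host answered, but nothing is listening on that port.
            return "refused"
        except OSError:
            return "error"
    return "ok"
```

If it behaves like curl did, calling `probe("postgres", 5432)` inside the airflow-init container should give "timeout" with the stock compose file and "dns" with the custom network. A "timeout" rather than "refused" would suggest the packets are being dropped rather than rejected by postgres itself.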
Some system info:
- OS: Arch Linux
- docker version: 20.10.7, build f0df35096d
- docker-compose version: 1.29.2
Logs! (as requested by @larsks)
~/dwn $ docker-compose ps
Name Command State Ports
-------------------------------------------------------------------------------------------------------------
dwn_airflow-init_1 /usr/bin/dumb-init -- /ent ... Up 8080/tcp
dwn_postgres_1 docker-entrypoint.sh postgres Up (healthy) 5432/tcp
dwn_redis_1 docker-entrypoint.sh redis ... Up (healthy) 0.0.0.0:6379->6379/tcp,:::6379->6379/tcp
~/dwn $ docker-compose logs postgres
Attaching to dwn_postgres_1
postgres_1 | The files belonging to this database system will be owned by user "postgres".
postgres_1 | This user must also own the server process.
postgres_1 |
postgres_1 | The database cluster will be initialized with locale "en_US.utf8".
postgres_1 | The default database encoding has accordingly been set to "UTF8".
postgres_1 | The default text search configuration will be set to "english".
postgres_1 |
postgres_1 | Data page checksums are disabled.
postgres_1 |
postgres_1 | fixing permissions on existing directory /var/lib/postgresql/data ... ok
postgres_1 | creating subdirectories ... ok
postgres_1 | selecting dynamic shared memory implementation ... posix
postgres_1 | selecting default max_connections ... 100
postgres_1 | selecting default shared_buffers ... 128MB
postgres_1 | selecting default time zone ... Etc/UTC
postgres_1 | creating configuration files ... ok
postgres_1 | running bootstrap script ... ok
postgres_1 | performing post-bootstrap initialization ... ok
postgres_1 | initdb: warning: enabling "trust" authentication for local connections
postgres_1 | You can change this by editing pg_hba.conf or using the option -A, or
postgres_1 | --auth-local and --auth-host, the next time you run initdb.
postgres_1 | syncing data to disk ... ok
postgres_1 |
postgres_1 |
postgres_1 | Success. You can now start the database server using:
postgres_1 |
postgres_1 | pg_ctl -D /var/lib/postgresql/data -l logfile start
postgres_1 |
postgres_1 | waiting for server to start....2021-07-17 07:31:38.491 UTC [47] LOG: starting PostgreSQL 13.3 (Debian 13.3-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
postgres_1 | 2021-07-17 07:31:38.493 UTC [47] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
postgres_1 | 2021-07-17 07:31:38.499 UTC [48] LOG: database system was shut down at 2021-07-17 07:31:35 UTC
postgres_1 | 2021-07-17 07:31:38.521 UTC [47] LOG: database system is ready to accept connections
postgres_1 | done
postgres_1 | server started
postgres_1 | CREATE DATABASE
postgres_1 |
postgres_1 |
postgres_1 | /usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*
postgres_1 |
postgres_1 | 2021-07-17 07:31:39.613 UTC [47] LOG: received fast shutdown request
postgres_1 | waiting for server to shut down....2021-07-17 07:31:39.615 UTC [47] LOG: aborting any active transactions
postgres_1 | 2021-07-17 07:31:39.616 UTC [47] LOG: background worker "logical replication launcher" (PID 54) exited with exit code 1
postgres_1 | 2021-07-17 07:31:39.616 UTC [49] LOG: shutting down
postgres_1 | 2021-07-17 07:31:39.644 UTC [47] LOG: database system is shut down
postgres_1 | done
postgres_1 | server stopped
postgres_1 |
postgres_1 | PostgreSQL init process complete; ready for start up.
postgres_1 |
postgres_1 | 2021-07-17 07:31:39.741 UTC [1] LOG: starting PostgreSQL 13.3 (Debian 13.3-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
postgres_1 | 2021-07-17 07:31:39.741 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
postgres_1 | 2021-07-17 07:31:39.741 UTC [1] LOG: listening on IPv6 address "::", port 5432
postgres_1 | 2021-07-17 07:31:39.748 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
postgres_1 | 2021-07-17 07:31:39.756 UTC [75] LOG: database system was shut down at 2021-07-17 07:31:39 UTC
postgres_1 | 2021-07-17 07:31:39.781 UTC [1] LOG: database system is ready to accept connections
postgres_1 | 2021-07-17 07:33:49.955 UTC [79] LOG: using stale statistics instead of current ones because stats collector is not responding
postgres_1 | 2021-07-17 07:34:00.040 UTC [79] LOG: using stale statistics instead of current ones because stats collector is not responding
postgres_1 | 2021-07-17 07:34:00.049 UTC [235] LOG: using stale statistics instead of current ones because stats collector is not responding
postgres_1 | 2021-07-17 07:34:10.141 UTC [79] LOG: using stale statistics instead of current ones because stats collector is not responding
When I edit the postgres service to make it accessible from the host (ports option), I can see it really is there:
~/dwn $ pg_isready -h localhost -p 5432
localhost:5432 - accepting connections
Here is what the network created by docker-compose looks like:
[
{
"Name": "dwn_default",
"Id": "8c4e4ab1629cd7d2cb5d532e28b0837a11bc3516ba094248294e5d734a69dc11",
"Created": "2021-07-17T10:15:50.694208715+02:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.19.0.0/16",
"Gateway": "172.19.0.1"
}
]
},
"Internal": false,
"Attachable": true,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"2c6dd1bcd0d81740ab17ff7816acd983ff053be2a8f886ef281b3e5ec1ec642b": {
"Name": "dwn_airflow-init_1",
"EndpointID": "945c9bd23ffb52bdee7ae9fdf32f48be623ac73cd60a5b248f919fce6aede366",
"MacAddress": "02:42:ac:13:00:04",
"IPv4Address": "172.19.0.4/16",
"IPv6Address": ""
},
"3a79a194d97e491c75e573fa78492c9d4f73efd4d868e709c20eb23c9a0ff2a6": {
"Name": "dwn_postgres_1",
"EndpointID": "b3245b8ab82edc78b205485cd39c368881d7c7b2bc29f325fd3f6f6d8605d9c1",
"MacAddress": "02:42:ac:13:00:03",
"IPv4Address": "172.19.0.3/16",
"IPv6Address": ""
},
"dd023f1d42be72d967c5045b7be29deca88caf99377e7d144c51f2212059cefa": {
"Name": "dwn_redis_1",
"EndpointID": "f85a6cd841028efb7fab17e40f814b0d9de300e90f9506df373d973695a38d97",
"MacAddress": "02:42:ac:13:00:02",
"IPv4Address": "172.19.0.2/16",
"IPv6Address": ""
}
},
"Options": {},
"Labels": {
"com.docker.compose.network": "default",
"com.docker.compose.project": "dwn",
"com.docker.compose.version": "1.29.2"
}
}
]
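To cross-check the addresses, the output of `docker network inspect` can be summarised with a few lines of Python. This is a sketch of my own: `summarize_network` is a made-up helper name, and the embedded sample is a trimmed copy of the JSON above with the container IDs shortened.

```python
import ipaddress
import json


def summarize_network(inspect_json):
    """Return {container_name: (ip, inside_declared_subnet)} for the first network listed."""
    network = json.loads(inspect_json)[0]
    subnets = [
        ipaddress.ip_network(cfg["Subnet"])
        for cfg in network["IPAM"]["Config"]
    ]
    summary = {}
    for container in network["Containers"].values():
        # "IPv4Address" carries a prefix length, e.g. "172.19.0.3/16".
        address = ipaddress.ip_interface(container["IPv4Address"]).ip
        in_subnet = any(
            address in net for net in subnets if net.version == address.version
        )
        summary[container["Name"]] = (str(address), in_subnet)
    return summary


# Trimmed sample; keys are shortened container IDs (illustrative only).
SAMPLE = """
[{"IPAM": {"Config": [{"Subnet": "172.19.0.0/16", "Gateway": "172.19.0.1"}]},
  "Containers": {
    "2c6d": {"Name": "dwn_airflow-init_1", "IPv4Address": "172.19.0.4/16"},
    "3a79": {"Name": "dwn_postgres_1", "IPv4Address": "172.19.0.3/16"},
    "dd02": {"Name": "dwn_redis_1", "IPv4Address": "172.19.0.2/16"}}}]
"""

for name, (ip, ok) in sorted(summarize_network(SAMPLE).items()):
    print(name, ip, "in subnet" if ok else "OUTSIDE subnet")
```

This only confirms that all three containers are attached to the same network with consistent addresses. It says nothing about whether packets actually pass between them.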
@jarek-potiuk suggested I check the IPv6 configuration. It still doesn't work, but I got some errors this time. Here is what I did:
I created /etc/docker/daemon.json with the following content:
{
"ipv6": true,
"fixed-cidr-v6": "2001:db8:1::/64"
}
This caused the following error (after a daemon restart):
could not find an available, non-overlapping IPv6 address pool among the defaults to assign to the network
This error can be fixed by setting network_mode: bridge for every service in the compose file, and now my services have IPv6 addresses:
[
{
"Name": "bridge",
"Id": "092767c3c4137429a7caaa85a1b87c7cb977c4f02055624fa84c4d586ed9758f",
"Created": "2021-07-17T14:42:08.353393246+02:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": true,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.17.0.0/16",
"Gateway": "172.17.0.1"
},
{
"Subnet": "2001:db8:1::/64",
"Gateway": "2001:db8:1::1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"964c9edadb8f7eb757cd7f1296c2af154ab407ef4d9872f8e613f61d64d6a443": {
"Name": "dwn_postgres_1",
"EndpointID": "a19bd83ff487611e78074eddafbca18e545edcf9ddc9d7851d3b6d68b7962419",
"MacAddress": "02:42:ac:11:00:02",
"IPv4Address": "172.17.0.2/16",
"IPv6Address": "2001:db8:1::242:ac11:2/64"
},
"b45ca546c1539f5f0f1d76423bd4f071efed2e3d6e118b8811e3fd28164fab5a": {
"Name": "dwn_airflow-init_1",
"EndpointID": "3a2fc42dfda6a534b6840971f4b11af9c78aac2253a036f46721ed6e5659f7b9",
"MacAddress": "02:42:ac:11:00:04",
"IPv4Address": "172.17.0.4/16",
"IPv6Address": "2001:db8:1::242:ac11:4/64"
},
"f140d9c90c24fca254e34aec549b559ec5f82bc8b14537e7249192e604110d53": {
"Name": "dwn_redis_1",
"EndpointID": "1c26f7afa8ada58626b67e7446347e1c4d540513df72784addcf334f99fd53d1",
"MacAddress": "02:42:ac:11:00:03",
"IPv4Address": "172.17.0.3/16",
"IPv6Address": "2001:db8:1::242:ac11:3/64"
}
},
"Options": {
"com.docker.network.bridge.default_bridge": "true",
"com.docker.network.bridge.enable_icc": "true",
"com.docker.network.bridge.enable_ip_masquerade": "true",
"com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
"com.docker.network.bridge.name": "docker0",
"com.docker.network.driver.mtu": "1500"
},
"Labels": {}
}
]
but there is another problem - name resolution stopped working:
~/dwn $ docker-compose up airflow-init
dwn_postgres_1 is up-to-date
dwn_redis_1 is up-to-date
Starting dwn_airflow-init_1 ... done
Attaching to dwn_airflow-init_1
airflow-init_1 | BACKEND=postgresql+psycopg2
airflow-init_1 | DB_HOST=postgres
airflow-init_1 | DB_PORT=5432
airflow-init_1 | ....................
airflow-init_1 | ERROR! Maximum number of retries (20) reached.
airflow-init_1 |
airflow-init_1 | Last check result:
airflow-init_1 | $ run_nc 'postgres' '5432'
airflow-init_1 | Traceback (most recent call last):
airflow-init_1 | File "<string>", line 1, in <module>
airflow-init_1 | socket.gaierror: [Errno -3] Temporary failure in name resolution
airflow-init_1 | Can't parse as an IP address
airflow-init_1 |
dwn_airflow-init_1 exited with code 1
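The row of dots and the "Maximum number of retries (20) reached" message suggest the init container polls the database port in a loop before giving up. I have not read the entrypoint source, so the following is only my approximation of that behaviour in Python. `tcp_check`, `wait_for_port`, the one-second delay, and the injectable `check` parameter are all my own assumptions:

```python
import socket
import time


def tcp_check(host, port, timeout=3.0):
    """Return None on success, or the exception raised while resolving or connecting."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as exc:  # covers socket.gaierror, timeouts and refusals
        return exc


def wait_for_port(host, port, retries=20, delay=1.0, check=tcp_check):
    """Poll until `check` succeeds; raise TimeoutError after `retries` failed attempts."""
    last_error = None
    for _ in range(retries):
        last_error = check(host, port)
        if last_error is None:
            return
        time.sleep(delay)
    raise TimeoutError(
        f"Maximum number of retries ({retries}) reached: {last_error!r}"
    )
```

With a loop like this, a name resolution failure fails fast on every attempt, so the whole run ends after roughly `retries * delay` seconds. A silently dropped connection instead costs the full timeout on each attempt, which would make the container look like it is hanging.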
This is actually documented: containers on the default bridge network can only access each other by IP, but access by IP still doesn't work:
~/dwn $ docker exec -i -t dwn_airflow-init_1 sh -c 'echo "PING" | nc -v 172.17.0.3 6379'
172.17.0.3: inverse host lookup failed: Host name lookup failure
^C
~/dwn $ echo "PING" | ncat -v localhost 6379
Ncat: Version 7.91 ( https://nmap.org/ncat )
Ncat: Connected to ::1:6379.
+PONG
Ncat: 5 bytes sent, 7 bytes received in 0.01 seconds.
I also found that disabling IPv6 at the daemon level does not disable IPv6 in containers, so I tried to disable it in the postgres container by setting sysctls. That part works as expected:
~/dwn $ docker exec -i -t dwn_postgres_1 cat /proc/sys/net/ipv6/conf/all/disable_ipv6
1
but still no network access.
I'm out of ideas at this point.