➜ k get pods -n edna    
NAME                              READY   STATUS    RESTARTS   AGE
airflow-79d5f59644-dd4k7          1/1     Running   0          16h
airflow-worker-67bcf7844b-rq7r8   1/1     Running   0          22h
backend-65bcb6546-wvvqj           1/1     Running   0          2d16h

So Airflow, running in the airflow-79d5f59644-dd4k7 pod, is trying to fetch logs from the Airflow worker (Celery/Python, which runs a simple Flask-based webserver that serves logs), and it can't, because the hostname airflow-worker-67bcf7844b-rq7r8 does not resolve inside airflow-79d5f59644-dd4k7:

*** Log file does not exist: /usr/local/airflow/logs/hello_world/hello_task/2020-07-14T22:05:12.123747+00:00/1.log
*** Fetching from: http://airflow-worker-67bcf7844b-rq7r8:8793/log/hello_world/hello_task/2020-07-14T22:05:12.123747+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-67bcf7844b-rq7r8', port=8793): Max retries exceeded with url: /log/hello_world/hello_task/2020-07-14T22:05:12.123747+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd37d6a9790>: Failed to establish a new connection: [Errno -2] Name or service not known'))
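For context, the webserver's log fetch is essentially a plain HTTP GET against the worker's hostname on port 8793. A minimal sketch that reproduces the same resolution failure (the `fetch_worker_log` helper is hypothetical, not Airflow's actual code):

```python
from urllib.error import URLError
from urllib.request import urlopen

def fetch_worker_log(host: str, log_path: str, port: int = 8793, timeout: float = 5.0) -> str:
    """GET http://<host>:<port><log_path>, like the webserver does for worker logs."""
    url = f"http://{host}:{port}{log_path}"
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode()

# The bare pod name is not a DNS name the webserver pod can resolve:
try:
    fetch_worker_log("airflow-worker-67bcf7844b-rq7r8",
                     "/log/hello_world/hello_task/2020-07-14T22:05:12.123747+00:00/1.log")
except URLError as exc:
    print("resolution failed:", exc.reason)  # e.g. [Errno -2] Name or service not known
```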

How can I make this work?

I understand that Airflow supports remote logging to S3, but is there a way to route requests by the pods' random hostnames?

I have created a NodePort Service, but Airflow has no idea about the DNS name of that service and keeps trying to access the logs by the hostname of the Airflow worker (reported back by Celery).

➜ k get pods -n edna
NAME                              READY   STATUS    RESTARTS   AGE
airflow-79d5f59644-dd4k7          1/1     Running   0          16h
airflow-worker-67bcf7844b-rq7r8   1/1     Running   0          22h
backend-65bcb6546-wvvqj           1/1     Running   0          2d17h

kubectl get pods -n edna -l app=edna-airflow-worker \                                                                        
    -o go-template='{{range .items}}{{.status.podIP}}{{"\n"}}{{end}}'
Tip: with the `kgp` alias: kgp -n edna -l app=edna-airflow-worker -o go-template='{{range .items}}{{.status.podIP}}{{" "}}{{end}}'
10.0.101.120
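The go-template above just reads `.status.podIP` from each item. Doing the same against `kubectl get pods -o json` output in Python (the sample document below is trimmed to just the fields involved, with the pod name and IP taken from the session above):

```python
import json

# Trimmed sample of `kubectl get pods -n edna -l app=edna-airflow-worker -o json`
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "airflow-worker-67bcf7844b-rq7r8"},
      "status": {"podIP": "10.0.101.120"}
    }
  ]
}
""")

# Equivalent of the go-template: one podIP per item
ips = [item["status"]["podIP"] for item in sample["items"]]
print(ips)  # → ['10.0.101.120']
```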

Get inside the airflow-79d5f59644-dd4k7 pod:

k exec -ti -n edna airflow-79d5f59644-dd4k7 bash  

  [DEV] airflow-79d5f59644-dd4k7 app # curl -L http://airflow-worker-67bcf7844b-rq7r8:8793/log/hello_world/hello_task/2020-07-14T21:59:01.400678+00:00/1.log

curl: (6) Could not resolve host: airflow-worker-67bcf7844b-rq7r8; Unknown error
  [DEV] airflow-79d5f59644-dd4k7 app # curl -L http://10.0.101.120:8793/log/hello_world/hello_task/2020-07-14T21:59:01.400678+00:00/1.log
[2020-07-14 21:59:07,257] {{taskinstance.py:669}} INFO - Dependencies all met for <TaskInstance: hello_world.hello_task 2020-07-14T21:59:01.400678+00:00 [queued]>
[2020-07-14 21:59:07,341] {{taskinstance.py:669}} INFO - Dependencies all met for <TaskInstance: hello_world.hello_task 2020-07-14T21:59:01.400678+00:00 [queued]>
[2020-07-14 21:59:07,342] {{taskinstance.py:879}} INFO - 
--------------------------------------------------------------------------------
[2020-07-14 21:59:07,342] {{taskinstance.py:880}} INFO - Starting attempt 1 of 1
[2020-07-14 21:59:07,342] {{taskinstance.py:881}} INFO - 
--------------------------------------------------------------------------------
[2020-07-14 21:59:07,348] {{taskinstance.py:900}} INFO - Executing <Task(PythonOperator): hello_task> on 2020-07-14T21:59:01.400678+00:00
[2020-07-14 21:59:07,351] {{standard_task_runner.py:53}} INFO - Started process 5795 to run task
[2020-07-14 21:59:07,912] {{logging_mixin.py:112}} INFO - Running %s on host %s <TaskInstance: hello_world.hello_task 2020-07-14T21:59:01.400678+00:00 [running]> airflow-worker-67bcf7844b-rq7r8
[2020-07-14 21:59:07,989] {{logging_mixin.py:112}} INFO - Hello world! This is really cool!
[2020-07-14 21:59:07,989] {{python_operator.py:114}} INFO - Done. Returned value was: Hello world! This is really cool!
[2020-07-14 21:59:08,161] {{taskinstance.py:1065}} INFO - Marking task as SUCCESS.dag_id=hello_world, task_id=hello_task, execution_date=20200714T215901, start_date=20200714T215907, end_date=20200714T215908
[2020-07-14 21:59:17,070] {{logging_mixin.py:112}} INFO - [2020-07-14 21:59:17,070] {{local_task_job.py:103}} INFO - Task exited with return code 0
  [DEV] airflow-79d5f59644-dd4k7 app # 


DmitrySemenov

2 Answers


Solution

Provide the following environment variable, AIRFLOW__CORE__HOSTNAME_CALLABLE, in the deployment.yaml of the worker pod:

env:
  - name: AIRFLOW__CORE__HOSTNAME_CALLABLE
    value: 'airflow.utils.net:get_host_ip_address'

Or just set hostname_callable in airflow.cfg.

Airflow will then access the worker by the pod's IP, and things work as long as your pod exposes port 8793.
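The callable boils down to resolving the pod's own hostname to its IP. A rough stand-alone equivalent (an approximation for illustration, not necessarily Airflow's exact implementation):

```python
import socket

def get_host_ip_address() -> str:
    """Resolve this host's own name to an IPv4 address, roughly what
    airflow.utils.net:get_host_ip_address yields inside the worker pod."""
    return socket.gethostbyname(socket.getfqdn())

print(get_host_ip_address())
```

Inside the worker pod this returns the pod IP (10.0.101.120 in the session above), which the webserver then uses to build the log URL.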

    This also solved my `*** Could not read served logs: [Errno -2] Name or service not known` in Airflow 2.6.0! – Zach May 03 '23 at 17:54

If pod A wants to communicate with pod B, pod B has to be exposed through a Service, say svc-b.

Then pod A can communicate with svc-b.

This is the rule.

For your case, pod B is airflow-worker-67bcf7844b-rq7r8:

kubectl -n edna expose pod airflow-worker-67bcf7844b-rq7r8 \
  --name airflow-worker \
  --port 8793 \
  --target-port 8793
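Equivalently, the Service can be declared in a manifest. Note that a Service selects pods by label, not by name; the selector below assumes the worker pods carry `app: edna-airflow-worker`, as in the label query used earlier in the question:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: airflow-worker
  namespace: edna
spec:
  selector:
    app: edna-airflow-worker
  ports:
    - port: 8793
      targetPort: 8793
```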

Now feel free to use curl -L http://airflow-worker:8793/ instead of curl -L http://airflow-worker-67bcf7844b-rq7r8:8793/

Abdennour TOUMI
  • I did this, but airflow has no idea that such service (and DNS name) exists. And tried to access by hostname of the POD which is reported by celery. – DmitrySemenov Jul 14 '20 at 23:37
  • What does the following command show `kubectl -n edna exec -it airflow-worker-67bcf7844b-rq7r8 -- curl -L http://localhost:8793/log/hello_world/hello_task/2020-07-14T21:59:01.400678+00:00/1.log ` ? Please copy/paste it as it is. – Abdennour TOUMI Jul 14 '20 at 23:45
  • The output of the command https://gist.github.com/githubsaritasa/18ac363fb584aeb2748ee39116962a9b – DmitrySemenov Jul 14 '20 at 23:48
  • I am running into the same issue as @DmitrySemenov – j7skov Feb 08 '22 at 14:28