How to resolve "failed to resolve host [elasticsearch-master-headless]"?

Question

I deployed Elasticsearch 6.5 with Helm on my GKE cluster. After deleting the Helm release and installing it again, I get the following error.

I have deleted the pods and services and restarted. There is no issue with the PVC.

Error logs:

[2019-10-18T10:14:10,021][INFO ][o.e.b.BootstrapChecks    ] [es-cluster-master-0] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2019-10-18T10:14:10,224][WARN ][o.e.d.z.UnicastZenPing   ] [es-cluster-master-0] failed to resolve host [elasticsearch-master-headless]
java.net.UnknownHostException: elasticsearch-master-headless: Name or service not known
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) ~[?:?]
        at java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:929) ~[?:?]



Did the link provided by @HarshManvar above help with your problem? Is it the same problem: the pod sits in the 1/2 state, unable to resolve any hosts? - Jakub, Oct 21 '19 at 11:26

Zenul_Abidin · Answer 1 · 2022-02-27T07:18:29.223

DNS resolution between your nodes is not working, so you need to deploy the Calico pod network on your cluster. Other pod networks may not work.

kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

You must recreate the cluster with kubeadm before applying the manifest.

You should also disable the firewalld systemd service on all worker nodes and control-plane nodes. Even with the k8s ports whitelisted, it appears to block Elasticsearch POST/PUT operations, although requesting the root URI "/" returns the Elasticsearch information normally.

Before disabling firewalld, I tried unblocking the port mentioned in https://medium.com/platformer-blog/kubernetes-on-centos-7-with-firewalld-e7b53c1316af, but that only got the Elasticsearch info URI working.
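For reference, a minimal sketch of the firewalld step, to be run as root on each node. The function name disable_firewalld is made up for illustration; the systemctl subcommands are the standard ones. Keep in mind that this removes the host firewall entirely, so only do it where that is acceptable.

```shell
# Hypothetical helper: stop firewalld now and keep it from starting at boot.
disable_firewalld() {
    if systemctl is-active --quiet firewalld; then
        # Stop the running service, then remove it from the boot sequence.
        systemctl stop firewalld
        systemctl disable firewalld
    else
        echo "firewalld is not active"
    fi
}
```

Call disable_firewalld on every worker and control-plane node before re-testing the POST/PUT requests.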

Lastly, I would flush the iptables rules before re-creating the k8s cluster (source: https://stackoverflow.com/a/57800320/12452330):

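# Stop the kubelet and Docker before touching the rules
systemctl stop kubelet # or microk8s stop
systemctl stop docker
# Flush the default (filter) table, then the nat table
iptables --flush
iptables -t nat --flush
systemctl start kubelet # or microk8s start
systemctl start docker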

After doing all that, verify that DNS works inside your cluster: run kubectl run -it --rm --image=infoblox/dnstools dns-client to start the dnstools container, then run dig google.com in the resulting shell.
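Since the log complains about the name elasticsearch-master-headless specifically, it may also be worth checking that exact name and not only an external one. Below is a rough sketch, not a definitive procedure. The function name check_headless_dns, the pod name dns-test, the busybox:1.28 image and the "default" namespace are all assumptions on my part; adjust them to your release.

```shell
# Hypothetical helper: check that a headless Service exists, has endpoints,
# and resolves from inside the cluster.
# $1 = service name (default taken from the log line)
# $2 = namespace (assumed to be "default")
check_headless_dns() {
    svc="${1:-elasticsearch-master-headless}"
    ns="${2:-default}"

    # Cluster DNS only has a record for the name if the Service exists.
    kubectl get service "$svc" --namespace "$ns" || return 1

    # A headless Service resolves to pod IPs, so look at its endpoints too.
    kubectl get endpoints "$svc" --namespace "$ns"

    # Resolve the name from a throwaway pod (the image is an assumption).
    kubectl run dns-test --namespace "$ns" --rm -it --restart=Never \
        --image=busybox:1.28 -- nslookup "$svc"
}
```

Running check_headless_dns with no arguments checks the name from the log. If the first step reports that the Service is not found, the host name in the discovery setting does not match any Service in that namespace, and kubectl get services will show which names actually exist.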

Note: If you see this error in your Elasticsearch logs after rebuilding your cluster:

org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: This node previously joined a cluster with UUID [BPT3dt-QQZW4eKaqIJlx3g] and is now trying to join a different cluster with UUID [qeS1rK3qS4qZDj7lTV_8ig]. This is forbidden and usually indicates an incorrect discovery or cluster bootstrapping configuration. Note that the cluster UUID persists across restarts and can only be changed by deleting the contents of the node's data path [/usr/share/elasticsearch/data] which will also remove any data held by this node.

Then you have to delete the contents of the node's data path and restart the pod. The error above reports the data path as /usr/share/elasticsearch/data; in my case the folder is mounted at /usr/share/elasticsearch in my pod YAML file. Yours may be different, so you may have to read the Helm file for your Elasticsearch pods to find it. A good first place to check is /mnt/disk/elasticsearch.
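A small sketch of the cleanup, assuming you already know which directory backs the data path. The function name clear_es_data is made up for illustration, and the directory has no default on purpose, because this wipes whatever Elasticsearch data is stored there.

```shell
# Hypothetical helper: delete everything inside a data directory
# but keep the directory itself, so the mount point survives.
# WARNING: this removes all data held by the node.
clear_es_data() {
    dir="$1"
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
        echo "usage: clear_es_data <existing data directory>" >&2
        return 1
    fi
    # -mindepth 1 skips the directory itself; -delete removes files first,
    # then the emptied subdirectories.
    find "$dir" -mindepth 1 -delete
}
```

For example, clear_es_data /mnt/disk/elasticsearch on the node (if that is where your volume lives), followed by a restart of the pod.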
