1

I have Kubernets 1.20.1 cluster with single master and single worker configured with ipvs mode. Using calico CNI calico/cni:v3.16.1. Cluster running on OS RHEL 8 kernel 4.18.0-240.10 with firewalld and selinux disabled.

Running one netshoot pod (10.1.30.130) on master and another pod (10.3.65.132) in worker node.

  1. I can ping both pod, in both direction
  2. if run the nc command in web server mode, connection is not working. I tried to run nginx on both server, not able get http traffic one server from another server.

Ran the tcpdump on both servers tcpdump -vv -nn -XX -i any host <PODIP> I can see ping traffic going to both nodes, but TCP traffic not reaching the other node.

iptables -vL | grep DROP command not showing any packet drop on both nodes.

I don't know where the TCP traffic getting lost, need some tips to troubleshoot this issue.

Master node iptables-save command output

# Generated by iptables-save v1.8.4 on Sat Jan 16 18:52:50 2021
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-NODE-PORT - [0:0]
:KUBE-LOAD-BALANCER - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "Kubernetes endpoints dst ip:port, source ip for solving hairpin purpose" -m set --match-set KUBE-LOOP-BACK dst,dst,src -j MASQUERADE
-A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
-A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully
-A KUBE-SERVICES ! -s 10.0.0.0/14 -m comment --comment "Kubernetes service cluster ip + port for masquerade purpose" -m set --match-set KUBE-CLUSTER-IP dst,dst -j KUBE-MARK-MASQ
-A KUBE-SERVICES -m addrtype --dst-type LOCAL -j KUBE-NODE-PORT
-A KUBE-SERVICES -m set --match-set KUBE-CLUSTER-IP dst,dst -j ACCEPT
-A KUBE-FIREWALL -j KUBE-MARK-DROP
-A KUBE-LOAD-BALANCER -j KUBE-MARK-MASQ
COMMIT
# Completed on Sat Jan 16 18:52:50 2021
# Generated by iptables-save v1.8.4 on Sat Jan 16 18:52:50 2021
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-FORWARD - [0:0]
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Sat Jan 16 18:52:50 2021
# Generated by iptables-save v1.8.4 on Sat Jan 16 18:52:50 2021
*mangle
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:KUBE-KUBELET-CANARY - [0:0]
COMMIT
# Completed on Sat Jan 16 18:52:50 2021

Worker iptables-save output

# Generated by iptables-save v1.8.4 on Sat Jan 16 18:53:58 2021
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-NODE-PORT - [0:0]
:KUBE-LOAD-BALANCER - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
-A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully
-A KUBE-SERVICES ! -s 10.0.0.0/14 -m comment --comment "Kubernetes service cluster ip + port for masquerade purpose" -m set --match-set KUBE-CLUSTER-IP dst,dst -j KUBE-MARK-MASQ
-A KUBE-SERVICES -m addrtype --dst-type LOCAL -j KUBE-NODE-PORT
-A KUBE-SERVICES -m set --match-set KUBE-CLUSTER-IP dst,dst -j ACCEPT
-A KUBE-FIREWALL -j KUBE-MARK-DROP
-A KUBE-LOAD-BALANCER -j KUBE-MARK-MASQ
COMMIT
# Completed on Sat Jan 16 18:53:58 2021
# Generated by iptables-save v1.8.4 on Sat Jan 16 18:53:58 2021
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-FORWARD - [0:0]
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Sat Jan 16 18:53:58 2021
# Generated by iptables-save v1.8.4 on Sat Jan 16 18:53:58 2021
*mangle
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:KUBE-KUBELET-CANARY - [0:0]
COMMIT
# Completed on Sat Jan 16 18:53:58 2021
sfgroups
  • 18,151
  • 28
  • 132
  • 204
  • Did you run `tcpdump` on `nodes` ? If it doesn't even reach the second node you need to focus on examining the outgoing traffic as apparently it never reaches the target node. Are you able to establish connection with `netcat` listening on the first node from the second one ? It doesn't work only from pod ? Did you try examining logs of calico pod ? – mario Jan 18 '21 at 22:37
  • @mario yes with `tcpdump` I can see ping traffic on both node. for `TCP` connection like netcat or ngnix request, I can see only in the source node, not reaching to other node. same setup work on `RHEL 7`, this issue only on `RHEL 8`. – sfgroups Jan 18 '21 at 23:07
  • What about your VMs network configuration ? Do they have more than one network interface ? You may additionally check in calico configuration what NIC it uses. Just make sure it doesn't use NAT interface, if you have one configured. If it's really something OS-specific, you'll need to investigate how your both setup differ. Has it stopped working after the OS upgrade ? Or maybe you still have this `RHEL 7` setup running and can check it for differences with what is currently set up on `RHEL 8` ? – mario Jan 30 '21 at 14:32
  • @mario my VM has only one interface, there is no `NAT` configuration. both VMs are in same frame and same subnet. only OS and kernel version is different. – sfgroups Jan 30 '21 at 17:17
  • This issue seems to be related with your [other question](https://stackoverflow.com/questions/65744565/kubernetes-dns-lookg-not-working-from-worker-node-connection-timed-out-no-ser). Did you manage to make it work on **RHEL 8** ? [This answer](https://stackoverflow.com/a/65992075/11714114) seems like a good explanation for both of the issues, you described and I saw you posted another article with workaround on **RHEL 8**. – mario Apr 08 '21 at 13:34
  • @mario yes issue is fixed. I have to added this line in `ETHTOOL_OPTS="-K ens192 tx-udp_tnl-csum-segmentation off; -K ens192 tx-udp_tnl-segmentation off"` in `/etc/sysconfig/network-scripts/ifcfg-ens192` file. I am using VMware VM. – sfgroups Apr 08 '21 at 16:17
  • Great, could you post it as an answer then ? It may serve other people who will come across a similar issue in the future. – mario Apr 08 '21 at 16:28

1 Answers1

2

I was able to resolve this issue by running below command on ens192 interface on VMware VM on.

# cat /etc/sysconfig/network-scripts/ifcfg-ens192 | grep ETHTOOL
ETHTOOL_OPTS="-K ens192 tx-udp_tnl-csum-segmentation off; -K ens192 tx-udp_tnl-segmentation off"

got the tips from here: https://github.com/kubernetes-sigs/kubespray/issues/7268 Thanks

SR

sfgroups
  • 18,151
  • 28
  • 132
  • 204