
I am using a Kubernetes v1.20.10 bare-metal installation with one master node and three worker nodes. The application simply serves HTTP requests.

I am scaling the deployment with the Horizontal Pod Autoscaler (HPA), and I noticed that the load is not distributed evenly across the pods. The first pod receives about 95% of the load, while the other pods receive very little.

I tried the answer mentioned here, but it did not work: Kubernetes service does not distribute requests between pods

Anthony Vinay

2 Answers


Based on the information provided, I assume that you are using HTTP keep-alive, i.e. persistent TCP connections. A Kubernetes Service distributes load per (new) TCP connection. With persistent connections, only the additional connections get distributed, which is the effect you are observing.

Try this: disable HTTP keep-alive, or set the maximum keep-alive time to something like 15 seconds and the maximum number of requests per connection to 50.
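As a rough illustration (a minimal sketch, not your actual application: the handler and port are placeholders), a Go HTTP server can cap how long idle keep-alive connections stay open, or disable keep-alive entirely. A per-connection request limit is usually configured in a reverse proxy in front of the pods rather than in the application itself.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})

	srv := &http.Server{
		Addr:    ":8080",
		Handler: mux,
		// Close idle keep-alive connections after 15 seconds so clients
		// have to reconnect and the Service can pick a different pod.
		IdleTimeout: 15 * time.Second,
	}

	// Alternatively, disable keep-alive completely: every request then
	// opens a fresh TCP connection that the Service can load-balance.
	// srv.SetKeepAlivesEnabled(false)

	srv.ListenAndServe()
}
```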

Thomas

If the connection is long-lived, the client will use the same pod for the whole lifetime of the connection; only new connections are distributed in a round-robin manner. In that case you can either handle load balancing on the client side or delegate it to a reverse proxy such as a Traefik ingress, which distributes individual requests round-robin. A client-side sketch follows below.
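For example, here is a minimal client-side sketch (the Service URL is a placeholder, not from the question) that disables keep-alive so each request opens a new connection and can be balanced to a different pod by the Service:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Disabling keep-alive on the client makes every request open a new
	// TCP connection, so each one can land on a different pod behind the
	// Service. Replace the URL with your own Service address.
	client := &http.Client{
		Transport: &http.Transport{DisableKeepAlives: true},
	}

	for i := 0; i < 5; i++ {
		resp, err := client.Get("http://my-service.default.svc.cluster.local:8080/")
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		fmt.Println("status:", resp.Status)
	}
}
```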

harnoor