OpenShift service with sessionAffinity forwards traffic to two pods

Question

OpenShift Container Platform 3.11

Assume a setup with one client pod and three identical server pods in the same namespace. The server pods are exposed through a service:

  apiVersion: v1
  kind: Service
  metadata:
    name: server
  spec:
    ports:
    - name: "8200"
      port: 8200
      targetPort: 8200
    selector:
      test.service: server
    sessionAffinity: ClientIP
    sessionAffinityConfig:
      clientIP:
        timeoutSeconds: 10800 # default

With sessionAffinity: ClientIP, requests from a client are forwarded to the same server pod as long as the client's IP stays the same and the affinity has not expired (timeoutSeconds). This setup worked as expected for months, until the requests were suddenly distributed between two server pods. Restarting the client pod solved the problem temporarily, and for some time the requests were forwarded to a single server pod again. After a few days, however, the same problem recurred.
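To make the expected behaviour concrete, here is a minimal Python sketch of ClientIP affinity. It is a toy model, not kube-proxy's actual code: the class name AffinityTable, the pod names, and the assumption that the timeout is measured from the last request are all mine.

```python
import random


class AffinityTable:
    """Toy model of ClientIP session affinity (not kube-proxy's real code)."""

    def __init__(self, endpoints, timeout_seconds=10800):
        self.endpoints = list(endpoints)
        self.timeout_seconds = timeout_seconds
        self.entries = {}  # client IP -> (endpoint, time of last use)

    def pick(self, client_ip, now):
        entry = self.entries.get(client_ip)
        if entry is not None:
            endpoint, last_used = entry
            still_valid = now - last_used <= self.timeout_seconds
            if still_valid and endpoint in self.endpoints:
                # Affinity still valid: reuse the endpoint, refresh the timestamp.
                self.entries[client_ip] = (endpoint, now)
                return endpoint
        # No entry, expired entry, or endpoint gone: choose a fresh endpoint.
        endpoint = random.choice(self.endpoints)
        self.entries[client_ip] = (endpoint, now)
        return endpoint


table = AffinityTable(["pod-a", "pod-b", "pod-c"])
first = table.pick("10.128.0.5", now=0)
# Within the timeout, the same client IP keeps hitting the same pod.
assert table.pick("10.128.0.5", now=60) == first
```

Under this model a client with an unchanged IP should only switch pods after an idle period longer than the timeout, or when its pod disappears, which is why the observed behaviour was surprising.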

My question: Is there anything about OpenShift services and sessionAffinity: ClientIP that explains why requests from the same client with an unchanged IP might "suddenly" be distributed between two server pods?

Some additional context:

The client pod receives a session token (not a cookie) when it connects to a server pod. The token is cached inside that server pod and is not shared with the other server pods. Therefore, when the client connects to a different server pod, its token is rejected with a permission denied error, and the client requests a new session token. As long as the client's requests go to the same server pod and the server changes only occasionally (e.g. because the first server crashed), this setup works fine. However, if the client's requests are distributed between two or more servers, the session token is invalid on every second or third request.
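The effect on the token can be illustrated with a small simulation. This is only a sketch: Server, Client and count_logins are hypothetical names, and the strict alternation between two pods is a worst case chosen to keep the example deterministic (the real distribution was less regular).

```python
import secrets


class Server:
    def __init__(self, name):
        self.name = name
        self.tokens = set()  # session tokens cached in this pod only

    def login(self):
        token = secrets.token_hex(8)
        self.tokens.add(token)
        return token

    def accepts(self, token):
        # False models the "permission denied" response.
        return token in self.tokens


class Client:
    def __init__(self):
        self.token = None
        self.logins = 0

    def request(self, server):
        if self.token is None or not server.accepts(self.token):
            # Token unknown to this pod: request a new session token.
            self.token = server.login()
            self.logins += 1


def count_logins(route, requests):
    """route(i) returns the server that receives request number i."""
    client = Client()
    for i in range(requests):
        client.request(route(i))
    return client.logins


a, b = Server("pod-a"), Server("pod-b")
sticky = count_logins(lambda i: a, 10)              # affinity works
alternating = count_logins(lambda i: (a, b)[i % 2], 10)  # affinity broken
print(sticky, alternating)
```

With working affinity the client logs in once; when requests bounce between pods, each switch costs a new login.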

score 0 · Answer 1 · answered May 14 '21 at 11:44

Judging from the Kubernetes proxysocket source, we assume that a slow connection attempt (taking longer than 250 ms) triggers the selection of a new endpoint.
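A rough Python model of that assumption follows. It is not the Kubernetes code: the function connect, the dial_time callback, and the "take the next endpoint in the list" rule are illustrative, and only the first timeout (250 ms) comes from our reading of the source; the later steps of the retry ladder are assumed.

```python
# Seconds. The first value is the 250 ms mentioned above; the rest are assumed.
DIAL_TIMEOUTS = [0.25, 0.5, 1.0, 2.0]


def connect(client_ip, affinity, endpoints, dial_time):
    """Return the endpoint that accepted the connection.

    affinity  -- dict mapping client IP to its current endpoint
    dial_time -- hypothetical callback: seconds a connect to an endpoint takes
    """
    reset = False
    for timeout in DIAL_TIMEOUTS:
        endpoint = affinity.get(client_ip)
        if reset or endpoint is None:
            # Affinity was reset by a failed dial: move to another endpoint.
            candidates = [e for e in endpoints if e != endpoint] or endpoints
            endpoint = candidates[0]
            affinity[client_ip] = endpoint
        if dial_time(endpoint) <= timeout:
            return endpoint
        # The dial timed out, so the next attempt ignores the old affinity.
        reset = True
    raise ConnectionError("no endpoint accepted the connection")


pods = ["pod-a", "pod-b", "pod-c"]
affinity = {"10.128.0.5": "pod-a"}
# pod-a answers in 300 ms (too slow for the first attempt), the others in 10 ms.
slow_a = lambda endpoint: 0.3 if endpoint == "pod-a" else 0.01
chosen = connect("10.128.0.5", affinity, pods, slow_a)
assert chosen == "pod-b" and affinity["10.128.0.5"] == "pod-b"
```

If this reading is right, a single slow connect is enough to move the client to another pod even though its IP and the affinity timeout are unchanged.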

Instead of distributing client connections between the servers via an OpenShift service, we now use an additional nginx pod between client and servers.
