
I'm trying to figure out how to provide zero-downtime rolling updates of a webapp that has long-lived interactive user sessions which should be sticky, based on a JSESSIONID cookie.

For this (and other) reasons I'm looking at container technology, such as Docker Swarm or Kubernetes.

I am having difficulties finding a good answer on how to:

  1. Make sure new sessions go to the latest version of the app
  2. While existing sessions continue to be served by whichever version of the app they were initiated on
  3. Properly clean up the old version once all sessions on it have been closed

Some more info:

  • Requests are linked to a session based on a JSESSIONID cookie
  • Sessions could potentially live for days, but I am able to terminate them from within the app within, say, a 24-hour timeframe (for example by sending the user a notification to "log out and log back in because there is a new version", or telling them that they will otherwise be logged out automatically at 12pm)
  • Of course for each version of the app there are multiple containers already running in load-balanced fashion
  • I don't mind the total number of containers growing, for example if each of the old version's containers is still up and running because it still hosts a single session, while the majority of users are already on the new version of the app

So, my idea of the required flow is something along these lines:

  1. Put up the new version of the app
  2. Let all new connections (the ones without the JSESSIONID cookie set) go to the new version of the app
  3. Once a container of the old version of the app is not serving sessions anymore, remove the container/...
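
To make this a bit more concrete, here is a rough sketch of how steps 1 and 2 could look in Kubernetes, with one Deployment per app version and a Service whose selector only matches the newest version. All names, labels and images are made up, and routing requests that already carry a JSESSIONID cookie back to the old pods would still need a cookie-aware proxy or ingress in front, which this sketch does not cover:

```yaml
# One Deployment per app version; the v1 Deployment (not shown) is
# identical apart from its labels and image tag, and simply keeps
# running untouched next to this one (step 1).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-v2                      # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
      version: "v2"
  template:
    metadata:
      labels:
        app: webapp
        version: "v2"
    spec:
      containers:
      - name: webapp
        image: registry.example.com/webapp:2.0   # hypothetical image
        ports:
        - containerPort: 8080
---
# Entry point for *new* sessions (step 2): the selector includes the
# version label, so only v2 pods receive requests without a JSESSIONID.
apiVersion: v1
kind: Service
metadata:
  name: webapp-new-sessions
spec:
  selector:
    app: webapp
    version: "v2"
  ports:
  - port: 80
    targetPort: 8080
```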

As I mentioned, I'm looking into Kubernetes and Docker Swarm, but I'm open to other suggestions; the end solution should be able to run on cloud platforms (currently Azure, but Google or Amazon clouds might be used in the future).

Any pointers/tips/suggestions or ideas appreciated

Paul

EDIT: In answer to @Tarun's question and for general clarification: yes, I want no downtime. The way I envision this is that the containers hosting the old version keep running to serve all existing sessions. Once all sessions on an old server have ended, that server is removed.

The new containers will only serve new sessions, for users that start the app after the rollout of the new version has begun.

So, to give an example:

  • I launch a new session A on the old version of the app at 9am
  • At 10am a new version is rolled out
  • I continue to use session A, which remains hosted on a container running the old version
  • At noon I go for lunch and log out
  • As mine was the last session connected to the container running the old version, that container is now destroyed
  • At 1pm I come back, log back in and get the new version of the app

Makes sense?

Paul
  • Welcome to StackOverflow. This question is too broad for SO - consider reading the [How to ask](https://stackoverflow.com/help/how-to-ask) guide to raise the chance of getting good answers. – pagid Mar 20 '17 at 15:50
  • @Paul - If I understand correctly, you require that the rolling update happens with no/minimal downtime and that users are automatically directed to the new containers? If that's the case (unless the application has something specific which violates this), the flow that you require seems like it can be done with Kubernetes very easily. – Dreams Mar 21 '17 at 05:50
  • @Paul Where are these sessions stored on the server? Or are they only stored as cookies by the client? – iamnat Mar 21 '17 at 11:37
  • @Tarun I've updated my question with an example based on your question. Hope that clarifies things. If that can be done easily with Kubernetes, could you give me some guidance? Because I haven't figured out how to do so – Paul Mar 21 '17 at 15:01
  • @iamnat: The clients store a JSESSIONID cookie and the server has a lot of state per session that it locates based on the value of the JSESSIONID cookie. Moving a session from one server to another is a no-go: the platform I'm using isn't architected to support that and most likely never will. – Paul Mar 21 '17 at 15:01
  • Did you try SessionAffinity? https://kubernetes.io/docs/api-reference/v1/definitions/#_v1_service (one of the properties of the service) – kanor1306 Mar 21 '17 at 15:09
  • @kanor1306 afaik IP-based session affinity is bad, as my clients could come from the same IP. Regardless of that, sessionAffinity is only part of what is needed (making sure a client always ends up on the server that has its session). The larger part of the question (how to do a zero-downtime rolling update) isn't solved (afaik) by just setting up session affinity. – Paul Mar 21 '17 at 15:16
  • What comes to my mind is creating a deployment (a proxy) that works as the entry point of the cluster (exposed on a NodePort together with an ingress controller), and putting the logic that you describe in your issue in that proxy. Then from the outside it will look like the service is the same (same URL), but in the proxy you keep track of the versioning of the pods and the relation between sessions and service version. A headless service may do the trick, although I am not sure about that last part. @Paul, right, the affinity does not solve the downtime. Hope that this gives you some ideas – kanor1306 Mar 21 '17 at 15:28
  • @Paul - Look at this question: http://stackoverflow.com/questions/29198840/marathon-vs-kubernetes-vs-docker-swarm-on-dc-os-with-docker-containers?rq=1. If you decide to stick with Kubernetes, check out the "deployment with load balancer service and how to update a deployment". That should get you started! – Dreams Mar 22 '17 at 06:45
  • @Tarun so far I tend to gravitate towards Kubernetes. When you say "check out the "deployment with load balancer service and how to update a deployment"", is there a specific piece of documentation you are referring to? 'cause I've been going through the docs and from all I've read I haven't figured out if/how Kubernetes supports my specific requirement of starting a zero-downtime rolling update AND only shutting down pods running the old version once all sessions hosted on them are finished/removed. Am I just overlooking something? – Paul Mar 22 '17 at 08:08
  • @Tarun Just to add to the above: I've looked at Deployments, which seem to be the best way to do rolling updates, and I've looked at the NGINX Ingress to do "session stickiness". The bit I haven't figured out yet is how to prevent pods from being killed by the rolling update while they are still hosting sessions. I looked at the preStop hook for that, but I'm unclear whether it could hold off the kill of a pod for maybe hours. And how do I prevent the pods running the old version from being used to start new sessions? (both mechanisms are sketched below, after the comments) – Paul Mar 22 '17 at 08:39
  • From what I gather from this issue, the preStop hook isn't the way to go, as it won't prevent the pod from being shut down once the grace period expires (unless I set the grace period to a couple of days (if that is even supported), but that doesn't sound like a best practice) – Paul Mar 22 '17 at 08:51
  • @Paul - Sorry for the late reply. Yes, I was referring to the docs: https://kubernetes.io/docs/user-guide/deployments/ . I have only tried rolling updates with the gracePeriod; since I work on systems where each request takes a very short time, I had not faced such a problem. I have not used preStop hooks, so I cannot comment on that. My suggestion would be to ask a separate question now that you have narrowed it down to a specific problem; you will get much better answers and suggestions there! – Dreams Mar 24 '17 at 05:38
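
For reference, a minimal sketch of the two mechanisms discussed in the comments above: cookie-based stickiness via the ingress-nginx affinity annotations, and delaying pod shutdown with a preStop hook plus a long terminationGracePeriodSeconds. All names here are hypothetical, and (as noted in the last comments) the grace period only postpones the kill rather than waiting for sessions to actually end:

```yaml
# Cookie-based session stickiness with the NGINX ingress controller,
# using the ingress-nginx affinity annotations.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: webapp                         # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "route"
spec:
  rules:
  - host: webapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: webapp
            port:
              number: 80
---
# Delaying pod shutdown: the preStop hook runs before SIGTERM, but the pod
# is still killed once terminationGracePeriodSeconds expires, so this alone
# cannot keep an old pod alive for an unbounded number of hours.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-v1                      # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      terminationGracePeriodSeconds: 3600        # e.g. one hour
      containers:
      - name: webapp
        image: registry.example.com/webapp:1.0   # hypothetical image
        lifecycle:
          preStop:
            exec:
              # hypothetical script that blocks until no sessions remain
              command: ["/bin/sh", "-c", "/opt/webapp/wait-for-sessions.sh"]
```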

1 Answer


Your workload might not be a good fit for Kubernetes/containers with its current architecture. The best way I can come up with to solve this is to move the state to a PV/PVC and migrate the PV to the new containers, so that a new container can have the state from the old session; as for how to migrate the calls for that session to the proper node, I'm not sure how to do that efficiently.

Ideally you would separate your data/caching layer from your service into something like Redis, and then it wouldn't matter which of the nodes serves the request.
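
For example, if session state lived in an external store such as Redis, a single Deployment with the standard RollingUpdate strategy would be enough, since any pod could then serve any request. A minimal sketch, with hypothetical names and a hypothetical environment variable the app would read to find its session store:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp                         # hypothetical name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0                # keep full capacity during the rollout
      maxSurge: 1
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: registry.example.com/webapp:2.0   # hypothetical image
        env:
        - name: SESSION_STORE_URL      # hypothetical variable read by the app
          value: "redis://redis:6379"
```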

Jack Quincy
  • Hi Jack, while I'd love to take all the state out of the service and/or migrate sessions from one server to another, the reality is that the stack I'm using doesn't support that, and moving off the stack is a (very) long-term thing. We're currently prototyping this with a custom ingress and our own upgrade/deployment mechanism, but indeed it ain't easy and doesn't utilize some of the built-in features of Kubernetes – Paul Apr 06 '17 at 08:33