How do websocket connections work through a load balancer?

Question

Forgive my ignorance, as my experience with load balancers and websockets is limited. I'm trying to understand how a client can connect through websockets to a cluster of servers that sits behind a load balancer.

My understanding of load balancers is that they're like reverse proxies. They take requests from clients and route them to a server in a cluster; when the server replies to the load balancer, the load balancer relays that information to the client. In this regard, they're like a middleman that plays telephone between a server and a client.

Now add websockets to the mix. If the client is trying to communicate through websockets, wouldn't the load balancer need to open 2 websocket connections? One with the client, and one with the server? That doesn't sound like it would scale, unless there's a cluster of load balancers as well.

My second guess is that load balancers don't truly "relay" information, they're simply a router that gives the client the IP of a server, and the communication after that happens between the server and the client directly.

The information I've found on this all seems to overlook this part of the explanation. I would really appreciate it if someone could explain what I'm missing.

Sumit Kesarwani · Answer 1 · 2020-12-09T07:41:43.490

Load balancers and reverse proxies have different use cases.

The main use case of a load balancer is to distribute load among the nodes in a group of servers, to manage the resource utilisation of each node.
One of the use cases of a reverse proxy is to hide server meta information (IP, port, etc.) from the client. It's a form of security.

We can configure a reverse proxy together with a load balancer, or we can configure a reverse proxy on its own.

Configuring a load balancer for a single node doesn't make sense, but we can configure a reverse proxy for a single node.

Handling WebSocket load, or distributing it across the WebSocket nodes in a cluster, is complicated to implement because of how WebSockets are used and the limits they run into.

Why it's complicated:

WebSocket connections are sticky: once you are connected, you remain connected to the same node for as long as your application is active.
There is a limit on the number of WebSocket connections a server can keep open (about 63k by default). You can raise it with kernel-level settings, but you then pay for it in system resources.

To address your question: if you put the WebSocket servers behind a load balancer, the client connects to the load balancer, and all WebSocket traffic between the client and the server passes through the load balancer.
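To make the "passes through the load balancer" part concrete, here is a minimal sketch in Python (asyncio) of a relay that holds two TCP connections per client, one to the client and one to the backend, and copies bytes in both directions. It is not a WebSocket implementation: a WebSocket starts as an HTTP upgrade request and then stays open as a long-lived TCP stream, and this sketch only models the long-lived stream. The names pipe, start_echo_backend, start_relay and demo are made up for illustration, and the echo backend merely stands in for a real WebSocket server.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(4096):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def start_echo_backend(host="127.0.0.1"):
    """Hypothetical backend: echoes bytes back (stand-in for a WebSocket server)."""
    async def handle(reader, writer):
        await pipe(reader, writer)
    return await asyncio.start_server(handle, host, 0)  # port 0 = any free port


async def start_relay(backend_host, backend_port, host="127.0.0.1"):
    """Hypothetical load balancer with a single backend node."""
    async def handle(client_reader, client_writer):
        # Connection 1 (client -> relay) already exists; open connection 2.
        backend_reader, backend_writer = await asyncio.open_connection(
            backend_host, backend_port
        )
        # Relay both directions for as long as the connection lives.
        await asyncio.gather(
            pipe(client_reader, backend_writer),
            pipe(backend_reader, client_writer),
        )
    return await asyncio.start_server(handle, host, 0)


async def demo():
    backend = await start_echo_backend()
    backend_port = backend.sockets[0].getsockname()[1]
    relay = await start_relay("127.0.0.1", backend_port)
    relay_port = relay.sockets[0].getsockname()[1]

    # The client only ever knows the relay's address, never the backend's.
    reader, writer = await asyncio.open_connection("127.0.0.1", relay_port)
    writer.write(b"hello")
    await writer.drain()
    reply = await reader.readexactly(5)
    writer.close()
    await writer.wait_closed()

    relay.close()
    backend.close()
    return reply
```

Calling asyncio.run(demo()) sends b"hello" to the relay and gets it back from the backend through the relay. The point of the sketch is that the relay does hold two connections per client, but each one is just a socket and a small buffer, which is why this model scales further than it first sounds.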

If each client request opened two independent WebSocket connections (assuming a single node behind the load balancer), the load balancer would serve no purpose and responses could become inconsistent (think about how a chat app works).

Where to put the load balancer when distributing load across WebSocket servers:

If you put your load balancer at L3 (network layer), requests are distributed according to the client's IP address. The service at the network layer generates a hash of the IP address and sends the request to the corresponding WebSocket server (consistent hashing). The network layer does not maintain any state for the request.
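As a rough illustration of the hashing idea, here is a small consistent-hash ring in Python. It is a sketch under assumptions, not how any particular load balancer implements it: the HashRing class, the backend names and the replicas value are all made up for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring that maps client IPs to backends."""

    def __init__(self, backends, replicas=100):
        # Each backend is placed on the ring many times to even out the load.
        self._ring = sorted(
            (_hash(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def pick(self, client_ip: str) -> str:
        # Walk clockwise to the first ring point after the client's hash.
        idx = bisect.bisect(self._points, _hash(client_ip)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["ws-1", "ws-2", "ws-3"])
backend = ring.pick("203.0.113.7")  # the same IP always yields the same backend
```

Because the backend is computed from the IP alone, the balancer needs no per-connection table. With a ring like this, removing one backend only moves the clients that were mapped to it; everyone else keeps their server.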

If you put your load balancer at L7 (application layer), the load balancer has to maintain state (which source IP-port pair is going to which backend node), which costs resources.
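The state mentioned here can be pictured as a lookup table keyed by the client's source IP and port. The following Python sketch is only an illustration: StickyTable, its methods and the round-robin assignment are assumptions for the example, not the behaviour of a specific product.

```python
import itertools


class StickyTable:
    """Hypothetical per-connection routing table kept by an L7 balancer."""

    def __init__(self, backends):
        self._next_backend = itertools.cycle(backends)  # simple round robin
        self._table = {}  # (src_ip, src_port) -> backend

    def route(self, src_ip, src_port):
        key = (src_ip, src_port)
        if key not in self._table:
            # First packet of a new connection: choose a backend and remember it.
            self._table[key] = next(self._next_backend)
        return self._table[key]

    def forget(self, src_ip, src_port):
        # Must be called when the connection closes, or the table only grows.
        self._table.pop((src_ip, src_port), None)

    def __len__(self):
        return len(self._table)


table = StickyTable(["ws-1", "ws-2"])
first = table.route("203.0.113.7", 50000)   # assigned to "ws-1"
again = table.route("203.0.113.7", 50000)   # same connection, same backend
```

The table holds one entry per open connection, so with long-lived WebSocket connections its size tracks the number of connected clients. That memory is the resource cost referred to above.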

I hope this clarifies some of your doubts.

I would suggest looking at chat messaging system design architectures, HTTP (Keep-Alive) vs WebSocket, and how to do pub/sub with WebSocket (it scales WebSocket very well). This blog has a good approach to scaling WebSocket for multiple users: https://hackernoon.com/scaling-websockets-9a31497af051

Don't miss nginx; its site is a good resource on distributed system design.

What is the source of your statement about the limit on websocket connections that you mention (~63k)? - magom001, Aug 15 '23 at 11:36
