
Let's say a server gets 10,000 concurrent connections (via socket.io). That's a lot, and if it can't handle any more, I need to spin up another server.

How can I sync the two servers together with their socket.io?

TIMEX

2 Answers

I wouldn't use Cluster to scale Socket.IO. Socket.IO 0.6 is designed as a single-process server; it uses long-lived connections or polling connections to achieve a real-time connection between the server and the client.

If you put Cluster in front of your Socket.IO server, you will basically distribute the polling transports between different workers that are not aware of the client. This will result in broken connections. Broadcasting to all your clients will also be a pain, as they are spread across different workers and you don't have IPC between them.

So I would only advise using Cluster if you use only WebSocket and Flash Socket connections and don't need the broadcast functionality.

So what should you do?

You could wait until socket.io 0.7 is released which is designed from the ground up to be used on multiple processes.

Or you can use pub/sub to send messages between different servers.

3rdEden
  • There's a working version of the RedisStore available here: https://github.com/dshaw/socket.io/blob/master/lib/stores/redis.js It should become available in socket.io itself soon as well, but if you need to scale now, this is probably the fastest way. – 3rdEden Sep 12 '11 at 08:57
  • When you say RedisStore, I presume you mean we store sessions in Redis, and then store data attached to a session-authenticated socket in Redis for the duration of the session's lifetime? The store uses the `socket.get` and `socket.set` interface. (Bonus gripe: socket.io lacks detailed documentation.) – Raynos Sep 12 '11 at 09:10
  • 2
    Yes, RedisStore. By default Socket.IO will handshakes & connection ids + socket data in the memory of the process but when you change to the redis store this data becomes available on all process as it's stored in one single place + sessionids are replicated overall processes. So they can accept each incoming poll even if they are handshaken on a different process. Anyways, the finished store landed in the github master yesterday, so you can play with it in the next release :) – 3rdEden Sep 20 '11 at 13:19
  • And yup, we are lacking on the documentation side. I'm hoping that someone from the community steps up and starts working on some wikis, as both me and rauchg are really busy atm :) – 3rdEden Sep 20 '11 at 13:20
  • You can use 0.6 with something like HAProxy, though. You can't use round-robin to load-balance polling transports, but you can balance those based on the source (IP affinity) or even go for cookie-based persistence. – Shripad Krishna Sep 23 '11 at 13:58
  • ex: `acl is_JSONPolling path_beg /socket.io/jsonp-polling` and then `use_backend non_websocket if is_JSONPolling` with backend "non_websocket" having `balance source` instead of `balance roundrobin` – Shripad Krishna Sep 23 '11 at 13:59
  • @3rdEden Do you have a plan to implement a MongoStore version as well? It's interesting that I could simply use it with the socket.set and socket.get methods :) – jwchang Jan 06 '12 at 01:10
  • @InspiredJW I'm personally not planning on doing that, but it should be relatively easy to create. – 3rdEden Jan 08 '12 at 15:11
  • @3rdEden - has your advice changed, now that we have 0.9? – UpTheCreek Apr 10 '12 at 11:52
  • 3
    @UpTheCreek You can use the RedisStore that we ship in Socket.IO https://github.com/LearnBoost/socket.io/blob/master/lib/stores/redis.js this will at least allow you to scale across multiple processes using the build in cluster functionality of Node.js – 3rdEden Apr 16 '12 at 08:46
  • @3rdEden What about scaling to multiple machines? – Evgeniy Berezovsky Jun 29 '12 at 02:49
  • @EugeneBeresovsky Answering myself: See ShripadK's first comment – Evgeniy Berezovsky Jul 10 '12 at 02:09
  • 2
    @3rdEden Any chance at updating your answer please? Looks like a lot has changed since then... Thanks! – cmcculloh Aug 02 '13 at 18:24
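For reference, wiring up the RedisStore discussed in these comments looked roughly like this in the 0.7–0.9 era. This is a sketch from that period's documented pattern, not a definitive recipe — the port is arbitrary, and the require paths moved around between releases, so check the version you're actually running:

```javascript
// Sketch: configure Socket.IO with the Redis store so handshakes, session
// ids, and socket data live in Redis instead of one process's memory.
var io = require('socket.io').listen(8000);
var RedisStore = require('socket.io/lib/stores/redis');
var redis = require('socket.io/node_modules/redis');

io.set('store', new RedisStore({
  redisPub: redis.createClient(),    // publishes events to other processes
  redisSub: redis.createClient(),    // subscribes to events from other processes
  redisClient: redis.createClient()  // general-purpose client for stored data
}));
```

With this in place, every process (or machine) pointed at the same Redis instance can accept any client's polls and participate in broadcasts.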

You can try using, for example, the cluster module to distribute the load across multiple cores (in case you have a multi-core CPU). If that is not enough, you can use a reverse proxy to distribute requests across multiple servers, with Redis as a central session data store (if that's possible for your scenario).

yojimbo87
  • Properly written `cluster` code should scale across multiple servers with a bit of boilerplate code. – Raynos May 10 '11 at 16:58