6

I have a Node.js TCP server that is used as a backend to an iPhone chat client. Since my implementation includes private group chats, I store a list of users and the chat room each belongs to in memory in order to route messages appropriately. This all works fine assuming my chat server will always be on one machine, but when/if I need to scale horizontally I need a good way of broadcasting messages to clients that connect to different servers. I don't want to start doing inter-process communication between Node servers and would prefer sharing state with Redis.

I have a few ideas but I'm wondering if anyone has a good solution for this? To be clear here is an example:

User 1 connects to server 1 on room X, user 2 connects to server 2 on room X. User 1 sends a message, and I need it to be passed to user 2, but since I am using an in-memory data structure the servers don't share state. I want my Node servers to remain as dumb as possible so I can just add or remove them according to the needs of my system.

Thanks :)

brandizzi
Emmanuel P

2 Answers

3

You could use a messaging layer (using something like pub/sub) that spans the processes:

                             Message Queue
-------------------------------------------------------------------------------
            |                                     |
         ServerA                               ServerB
         -------                               -------
Room 1: User1, User2                  Room 1: User3, User5
Room 2: User4, User7, User11          Room 2: User6, User8
Room 3: User9, User13                 Room 3: User10, User12, User14

Let's say User1 sends a chat message. ServerA sends a message on the message queue that says "User1 in Room 1 said something" (along with whatever they said). Each of your other server processes listens for such events, so, in this example, ServerB will see that it needs to distribute the message from User1 to all users in its own Room 1. You can scale to many processes in this way--each new process just needs to make sure it listens to the appropriate messages on the queue.

Redis has pub/sub functionality that you may be able to use for this if you're already using Redis. Additionally, there are other third-party tools for this kind of thing, like ZeroMQ; see also this question.
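As a rough illustration of the flow described above, here is a minimal sketch in Node.js. Everything in it is an assumption made for the example: the `InMemoryBus` class is a hypothetical in-process stand-in for the message queue (in a real deployment its `publish`/`subscribe` methods would wrap Redis PUBLISH/SUBSCRIBE or a similar tool), and `ChatServer`, `fakeSocket`, the `'chat'` channel name, and the JSON message shape are invented names, not part of any library.

```javascript
// Hypothetical in-process stand-in for the message queue. Only these two
// methods are assumed; a Redis- or ZeroMQ-backed bus would expose the same shape.
class InMemoryBus {
  constructor() {
    this.handlers = new Map(); // channel -> array of handler functions
  }
  subscribe(channel, handler) {
    if (!this.handlers.has(channel)) this.handlers.set(channel, []);
    this.handlers.get(channel).push(handler);
  }
  publish(channel, message) {
    for (const handler of this.handlers.get(channel) || []) handler(message);
  }
}

class ChatServer {
  constructor(bus) {
    this.bus = bus;
    this.rooms = new Map(); // room -> Map(userId -> socket-like object)
    // Every server listens on the same channel and writes only to its own sockets.
    bus.subscribe('chat', (raw) => this.deliverLocally(JSON.parse(raw)));
  }
  join(room, userId, socket) {
    if (!this.rooms.has(room)) this.rooms.set(room, new Map());
    this.rooms.get(room).set(userId, socket);
  }
  // Called when a locally connected user sends a message. The server does not
  // write to sockets here; it publishes and lets every subscriber (itself
  // included) deliver to its local members.
  send(room, from, text) {
    this.bus.publish('chat', JSON.stringify({ room, from, text }));
  }
  deliverLocally({ room, from, text }) {
    const members = this.rooms.get(room);
    if (!members) return; // nobody from this room is connected here
    for (const [userId, socket] of members) {
      if (userId !== from) socket.write(`${from}: ${text}\n`);
    }
  }
}

// Fake sockets that record what was written, so the sketch runs without a network.
function fakeSocket() {
  const written = [];
  return { written, write: (data) => written.push(data) };
}

const bus = new InMemoryBus();
const serverA = new ChatServer(bus);
const serverB = new ChatServer(bus);

const user1 = fakeSocket();
const user2 = fakeSocket();
const user3 = fakeSocket();
const user6 = fakeSocket();
serverA.join('Room 1', 'User1', user1);
serverA.join('Room 1', 'User2', user2);
serverB.join('Room 1', 'User3', user3);
serverB.join('Room 2', 'User6', user6);

serverA.send('Room 1', 'User1', 'hello');
console.log(user2.written, user3.written, user6.written);
```

The socket objects never leave the process that owns them; only the small JSON message crosses the bus, which is what keeps each server's room table local.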

Community
Michelle Tilley
  • That's a fairly good solution, but the message queue will be the bottleneck since every message has to go through it. With a small optimization it becomes a best practice: try to ensure that users in the same room are on the same server, so you don't need the message queue. If rooms grow and need to be split across a bunch of servers, you need to use the message queue, but small channels will not need it. – Tobias P. Sep 26 '11 at 09:36
  • This would require quite a bit of rework in his architecture - I think he's looking for a solution that's invisible to the node logic. – UpTheCreek Mar 29 '12 at 13:22
0

Redis is supposed to have built-in cluster support in the near future. In the meantime, you can use a consistent hashing algorithm to distribute your keys evenly across multiple servers. Someone has written a hashing module for node.js specifically to implement consistent hashing for a Redis cluster module. You might want to key off the 'room' name to ensure that all data points for a room wind up on the same host. With this type of setup, all the logic for choosing a server remains on the client, so your Redis cluster can basically remain the same and you can easily add or remove hosts.
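To make the idea concrete, here is a minimal sketch of a consistent hash ring keyed by room name. It is not the module mentioned above; `HashRing`, `fnv1a`, the host names, and the `room:` key prefix are all hypothetical, and a simple FNV-1a hash over the string's character codes is used only to keep the example dependency-free.

```javascript
// 32-bit FNV-1a over the string's UTF-16 code units. Chosen for brevity only;
// any reasonably uniform hash would do here.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

class HashRing {
  constructor(nodes = [], replicas = 100) {
    this.replicas = replicas; // virtual points per node, to even out the spread
    this.ring = [];           // array of { point, node }, sorted by point
    for (const node of nodes) this.add(node);
  }
  add(node) {
    for (let i = 0; i < this.replicas; i++) {
      this.ring.push({ point: fnv1a(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }
  remove(node) {
    this.ring = this.ring.filter((entry) => entry.node !== node);
  }
  // Returns the node owning the first ring point at or after the key's hash,
  // wrapping around to the start of the ring if necessary.
  get(key) {
    if (this.ring.length === 0) return undefined;
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].node;
  }
}

// Hypothetical host names; keying on the room keeps a room's data on one host.
const ring = new HashRing(['redis-a', 'redis-b', 'redis-c']);
const hostForRoomX = ring.get('room:X');
console.log(hostForRoomX);
```

Because lookups depend only on the ring contents, every client that builds the ring from the same host list picks the same host for a given room, and removing a host only remaps the rooms that host owned.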

Update

I found the consistent hashing implementation for Redis I was talking about. It includes the code, of course, and also explains sharding in an easy-to-digest way.

http://ngchi.wordpress.com/2010/08/23/towards-auto-sharding-in-your-node-js-app/

profitphp
  • Right now I am not too worried about sharding my Redis cluster. I have a fairly large shared instance to operate with. My issue is more on the node.js side. Right now I store a user's socket object in memory so that when a message comes in on a particular room I can pull all sockets pertaining to that room and write to them. The issue is that this in-memory data structure isn't shared between the node instances, so I need some way of managing state elsewhere. Does this make sense? – Emmanuel P Sep 25 '11 at 23:05
  • Ya makes sense, I guess I just assumed when you said you stored it in memory you meant that data was stored in redis. Sharding could still help you in this situation, it's looking like you're going to need to rethink the architecture a little bit though. – profitphp Sep 25 '11 at 23:11