I'm thinking about writing a few web applications with almost the same requirements as a chat application, and I would like them to scale easily.
I have worked a bit with node.js and I understand how it helps with designing push applications, but I have trouble seeing how to run them on multiple servers.
Here are some designs I can think of for a large-scale chat app:
1 - Servers have state: they keep the connections open, and clients can have new messages pushed to them. In this scenario, we are limited by the physical memory of one server, so we cannot scale linearly if we have too many users per room.
2 - Servers have no state: they query a distributed database to respond to clients' requests. In this scenario, clients poll the servers. We could scale linearly, but throughput is lower, messages are not delivered instantly, and polling is generally considered bad practice at scale.
3 - A mix of 1 and 2. Each server keeps its clients' connections open and polls the distributed database. The application is more complex to write and we still use polling; similar client requests (clients in the same room) are just grouped into a single request made by the server. The code becomes unnecessarily complicated, and it does not scale when we have many rooms with only a few users per room.
4 - Servers have no state, and the database cluster uses events to notify every registered server about new messages. This is the solution I would like to have, but I haven't heard of any database that has this feature. (Some people are discussing this feature for MongoDB here: https://jira.mongodb.org/browse/SERVER-124)
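To make option 4 more concrete, here is a rough sketch of what I have in mind. It is only an illustration under assumptions: `MessageBus` is a hypothetical in-process stand-in for the notification feature I wish the database cluster had, `ChatServer` is a made-up name, and the callbacks passed to `connect` stand in for open client connections. Each server keeps only its own connections and no message state.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the database cluster's notification channel.
// A real deployment would need something that works across machines.
class MessageBus extends EventEmitter {
  publish(room, message) {
    // Every registered server is notified of every new message.
    this.emit('message', { room, message });
  }
}

class ChatServer {
  constructor(name, bus) {
    this.name = name;
    this.bus = bus;
    // room name -> Set of push callbacks (stand-ins for open sockets)
    this.rooms = new Map();
    // Register this server with the bus once, at startup.
    bus.on('message', ({ room, message }) => this.deliver(room, message));
  }

  // A client connects to this server and joins a room.
  connect(room, push) {
    if (!this.rooms.has(room)) this.rooms.set(room, new Set());
    this.rooms.get(room).add(push);
  }

  // A client sends a message: the server only forwards it to the bus.
  send(room, message) {
    this.bus.publish(room, message);
  }

  // Called on notification: push to local clients of that room, if any.
  deliver(room, message) {
    const clients = this.rooms.get(room);
    if (!clients) return;
    for (const push of clients) push(message);
  }
}

// Two servers sharing one bus; clients of the same room are spread over both.
const bus = new MessageBus();
const serverA = new ChatServer('A', bus);
const serverB = new ChatServer('B', bus);

const aliceInbox = [];
const bobInbox = [];
const carolInbox = [];
serverA.connect('lobby', (m) => aliceInbox.push(m));
serverB.connect('lobby', (m) => bobInbox.push(m));
serverB.connect('other', (m) => carolInbox.push(m));

// Alice's message enters through server A and reaches Bob on server B.
serverA.send('lobby', 'hello');
```

In this sketch no server polls anything, and adding a server only means registering one more listener on the bus. The catch is that every server is notified of every message, including messages for rooms it has no clients in.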
So why is the 4th solution not used much today?
How do people usually design their applications in this situation?