Suppose I need to implement a web application that will have a high volume of concurrent users. I decide to use node.js because it scales well, performs well, has an active open-source community, and so on. Then, to avoid bottlenecks (since I could have gazillions of users in the same event loop), I decide to use a cluster of processes to take advantage of the multi-core CPU. Furthermore, I have 3 machines (main + 2) because I need to manipulate big data with Cassandra. Awesome: this means I have 3*n node.js processes, where n is the number of cores per CPU (the machines are identical).
OK, so I do some research and end up with the following scheme:
- Nginx listening on port 80, used only to serve static content (img, css, js, etc.). It forwards the dynamic traffic to HAProxy. I know how to configure Nginx, but I still have to take a look at HAProxy, so I'll say that HAProxy is listening on port 4000. Nginx and HAProxy are installed on the main machine (the entry point).
- HAProxy load-balances between the 3 machines, forwarding traffic to port 4001; that is, the node.js processes are listening on 4001.
- Every machine runs a node.js cluster of n processes, all listening on 4001.
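Since I haven't actually configured HAProxy yet, here's my rough understanding of what its side of the setup might look like (the IPs, names and timeouts are made up, and I may be off on details):

```
frontend dynamic_in
    bind *:4000
    mode http
    default_backend node_cluster

backend node_cluster
    mode http
    balance roundrobin
    server main  127.0.0.1:4001   check
    server node2 192.168.0.2:4001 check
    server node3 192.168.0.3:4001 check
```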
If I'm correct, a single HTTP request will be handled by a single node.js process.
Creating a session is quite normal, right? A session is just a map; this map is an Object, and this Object lives in a single node.js process. HAProxy will be configured with a round-robin scheduler, so the same user can be forwarded to different node.js processes on different requests. How can I share the same session object across all the node.js processes? More generally, how can I share a global object, both between processes on the same machine (the node.js cluster) and across the network? How should I design a distributed web app with node.js? Are there any modules that ease these synchronization tasks?