I'm making an online browser game with WebSockets and a Node server. With around 20-30 players, CPU usage is usually around 2% and RAM is at 10-15%. I'm just using a cheap Digital Ocean droplet to host it.
However, roughly every 20-30 minutes, the server's CPU usage spikes to 100% for about 10 seconds, and then the server crashes. Up until that moment, the CPU is hovering around 2% and the game is running very smoothly.
I can't tell for the life of me what is triggering this as there are no errors in the logs and nothing in the game that I can see causes it. Just seems to be a random event that brings the server down.
There are also some smaller spikes that don't bring the server down and soon resolve themselves. Here's an image:
I don't think I'm blocking the event loop anywhere and I don't have any execution paths that seem to be long running. The packets to and from the server are usually two per second per user, so not much bandwidth used at all. And the server is mostly just a relay with little processing of packets other than validation so I'm not sure what code path could be so intensive.
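To check the event-loop assumption rather than guess, something like the sketch below could be added to the server. It is only a sketch: `startLoopLagMonitor`, `nsToMs`, the 5-second reporting interval, and the 100 ms warning threshold are hypothetical names and values chosen for illustration, not part of my game code. `monitorEventLoopDelay` is the built-in from Node's `perf_hooks` module.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

// Hypothetical helper: the histogram reports nanoseconds, convert to milliseconds.
function nsToMs(ns) {
  return ns / 1e6;
}

// Hypothetical helper: samples event-loop delay and logs a summary every
// `intervalMs` when the worst delay in that window reaches `warnMs`.
// Both defaults are arbitrary assumptions, not recommended values.
function startLoopLagMonitor({ intervalMs = 5000, warnMs = 100, log = console.log } = {}) {
  const histogram = monitorEventLoopDelay({ resolution: 20 });
  histogram.enable();

  const timer = setInterval(() => {
    const maxMs = nsToMs(histogram.max);
    const p99Ms = nsToMs(histogram.percentile(99));
    if (maxMs >= warnMs) {
      log(`[loop-lag] max=${maxMs.toFixed(1)}ms p99=${p99Ms.toFixed(1)}ms`);
    }
    histogram.reset(); // start a fresh window for the next report
  }, intervalMs);
  timer.unref(); // monitoring alone should not keep the process alive

  // Returns a function that stops the monitor.
  return () => {
    clearInterval(timer);
    histogram.disable();
  };
}
```

Calling `startLoopLagMonitor()` once at startup would print a line whenever the loop stalls. If nothing is logged right before a spike, the event loop is probably not the culprit and the cause is more likely outside my own handlers.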
What can I do to profile this, and where should I begin investigating what is causing these spikes? I'd like to imagine there's some code path I forgot about that is surprisingly slow under load, or maybe I'm missing a Node flag that would resolve it, but I don't know.
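As a starting point, one thing I could do is log memory and CPU counters periodically, so there is a trail to read after the next crash. The sketch below is an assumption about what that might look like: `sampleResources`, `startResourceLogger`, and the 10-second interval are hypothetical, while `process.memoryUsage()` and `process.cpuUsage()` are built-in Node APIs.

```javascript
const BYTES_PER_MB = 1024 * 1024;

// Hypothetical helper: takes one snapshot of memory and CPU counters.
// When `prevCpu` is given, the CPU fields are the time spent since that reading.
function sampleResources(prevCpu) {
  const mem = process.memoryUsage();
  const cpu = process.cpuUsage(prevCpu); // values are in microseconds
  return {
    rssMb: mem.rss / BYTES_PER_MB,
    heapUsedMb: mem.heapUsed / BYTES_PER_MB,
    heapTotalMb: mem.heapTotal / BYTES_PER_MB,
    cpuUserMs: cpu.user / 1000,
    cpuSystemMs: cpu.system / 1000,
  };
}

// Hypothetical helper: logs one JSON line per interval.
// The 10-second default is an arbitrary assumption.
function startResourceLogger({ intervalMs = 10000, log = console.log } = {}) {
  let lastCpu = process.cpuUsage();

  const timer = setInterval(() => {
    const snapshot = sampleResources(lastCpu);
    lastCpu = process.cpuUsage();
    log(JSON.stringify({ time: new Date().toISOString(), ...snapshot }));
  }, intervalMs);
  timer.unref(); // logging alone should not keep the process alive

  // Returns a function that stops the logger.
  return () => clearInterval(timer);
}
```

If `heapUsedMb` climbs steadily in the minutes before a crash, that would point toward a memory leak and garbage-collection pressure rather than a slow code path. If it stays flat, I would look at CPU profiling next.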