
I'm making an online browser game with websockets and a Node server. With around 20-30 players, CPU usage is usually around 2% and RAM is at 10-15%. I'm just using a cheap Digital Ocean droplet to host it.

However, roughly every 20-30 minutes, the server's CPU usage spikes to 100% for about 10 seconds and then the server crashes. Up until that moment, the CPU is usually hovering around 2% and the game is running very smoothly.

I can't tell for the life of me what is triggering this, as there are no errors in the logs and nothing in the game that I can see causes it. It just seems to be a random event that brings the server down.

There are also some smaller spikes that don't bring the server down and soon resolve themselves. Here's an image:

http://i.imgur.com/EH3o8ue.png

I don't think I'm blocking the event loop anywhere and I don't have any execution paths that seem to be long running. The packets to and from the server are usually two per second per user, so not much bandwidth used at all. And the server is mostly just a relay with little processing of packets other than validation so I'm not sure what code path could be so intensive.

What can I do to profile this and find out what is causing these spikes? I imagine there's some code path I forgot about that is surprisingly slow under load, or maybe I'm missing a Node flag that would resolve it, but I don't know.
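
One way to test the assumption that nothing blocks the event loop is to measure event loop delay directly with Node's built-in `perf_hooks` module. The following is only a minimal sketch: `summarize`, `startLoopDelayMonitor`, the 100 ms threshold, and the 5 second reporting interval are all illustrative choices, not anything taken from the game's code.

```javascript
'use strict';
const { monitorEventLoopDelay } = require('perf_hooks');

// The histogram reports nanoseconds; convert the readings to milliseconds.
function summarize(histogram) {
  return {
    meanMs: histogram.mean / 1e6,
    p99Ms: histogram.percentile(99) / 1e6,
    maxMs: histogram.max / 1e6,
  };
}

// Hypothetical helper: sample event loop delay and report whenever the
// worst delay seen in the last interval exceeds thresholdMs.
function startLoopDelayMonitor({
  thresholdMs = 100,   // assumed threshold, tune for the game
  intervalMs = 5000,   // assumed reporting interval
  report = console.warn,
} = {}) {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();

  const timer = setInterval(() => {
    const stats = summarize(histogram);
    if (stats.maxMs > thresholdMs) {
      report('event loop stalled', stats);
    }
    histogram.reset(); // start a fresh window for the next interval
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring

  // Return a function that stops the monitor.
  return function stop() {
    clearInterval(timer);
    histogram.disable();
  };
}

// Usage: call once at server startup.
// const stopMonitor = startLoopDelayMonitor();
```

If a stall is logged shortly before each spike, that would point at synchronous work on the main thread. Running the server with `node --prof` and then feeding the resulting log to `node --prof-process` is another option for seeing which functions consume the CPU time.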

Lawrence Douglas

1 Answer


I think I might have figured it out.

I'm using mostly websockets for my game and I was running htop and noticed that if someone sends large packets (performing a ton of actions in a short amount of time) then the CPU spikes to 100%. I was wondering why that was when I remembered I was using a binary-packer to reduce bandwidth usage.

I tried changing the parser to JSON instead, so that the packets are no longer compressed and packed, and regardless of how large the packets were, the CPU usage stayed at 2% the entire time.
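
To put numbers on that comparison instead of watching htop, the encode step can be timed in isolation. This is a rough sketch, not a rigorous benchmark: `timeEncode` and `makeBurst` are hypothetical helpers, and the payload shape is invented to imitate one player sending many actions at once.

```javascript
'use strict';

// Time how long `encode` takes for one payload, averaged over `iterations`
// calls. Returns the mean time per call in milliseconds.
function timeEncode(encode, payload, iterations = 1000) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    encode(payload);
  }
  const elapsedNs = process.hrtime.bigint() - start;
  return Number(elapsedNs) / 1e6 / iterations;
}

// Hypothetical payload: a burst of many small actions from one player.
function makeBurst(actionCount) {
  const actions = [];
  for (let i = 0; i < actionCount; i++) {
    actions.push({ type: 'move', x: i, y: i * 2, t: i });
  }
  return { player: 1, actions };
}

const burst = makeBurst(5000);
const jsonMs = timeEncode(JSON.stringify, burst, 50);
console.log(`JSON.stringify: ${jsonMs.toFixed(3)} ms per message`);

// To compare, pass the binary packer's encode function the same way, e.g.
// timeEncode((p) => packer.encode(p), burst, 50), where `packer` stands in
// for whichever library is actually in use.
```

If the packer turns out to take tens of milliseconds per large message, a handful of large messages in a row would be enough to saturate a single core.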

So I think the crash happened when one player sent a lot of data in a short amount of time and the server was overwhelmed by having to pack all of it and send it out in time.
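
Whichever encoding is used, it probably also makes sense to stop a single player from flooding the server. A minimal sketch of a per-connection limit, assuming a token bucket and a size cap: `TokenBucket`, `shouldAccept`, and the 4096 byte limit are illustrative names and values, not part of any websocket library.

```javascript
'use strict';

// Token bucket: allows bursts of up to `capacity` messages and refills at
// `refillPerSec` tokens per second. `now` is injectable to ease testing.
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.tokens = capacity;
    this.last = now();
  }

  // Returns true and consumes a token if one is available, else false.
  tryRemove() {
    const t = this.now();
    const elapsedSec = (t - this.last) / 1000;
    this.last = t;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSec
    );
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

const MAX_MESSAGE_BYTES = 4096; // assumed cap, tune to the real packet sizes

// Decide whether to process an incoming message (string or Buffer).
// Oversized messages are rejected without consuming a token.
function shouldAccept(bucket, message) {
  if (Buffer.byteLength(message) > MAX_MESSAGE_BYTES) return false;
  return bucket.tryRemove();
}

// Usage: create one bucket per connection when the socket opens, e.g.
// const bucket = new TokenBucket(10, 5);
// and call shouldAccept(bucket, data) before parsing or relaying `data`.
```

Since normal traffic is about two packets per second per user, a bucket of 10 messages refilling at 5 per second leaves plenty of headroom for legitimate play while capping how much work one client can force on the server.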

This may not be the actual answer but it's at least something that needs to be fixed. Thankfully the game uses very little bandwidth as it is and bandwidth is not the bottleneck so I may just leave it as JSON.

The only problem is that with JSON encoding, users can read the packets in the network tab of the Chrome developer tools, which I don't like. It makes it a lot easier to find out how the game works and potentially find cheats or exploits.

Lawrence Douglas