I see that Parse Server instances running on AWS EC2 show increasing response times: the longer an instance is in service, the higher its latency, even though the number of requests per instance stays the same.
In this article it says:

> Imagine that a running instance of your server is only using 25/30% CPU before latency goes up. At that point, you have to deploy a new instance of the same server to maintain a low latency. Why? Because of the use of a single thread, incoming requests are quickly queued, especially if one of the requests takes time to build the response.
From this answer I understand that the commands `process._getActiveHandles()` and `process._getActiveRequests()` can be used to inspect the queue.
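This is roughly how I am calling them (a minimal sketch; both are undocumented internal APIs, and as far as I can tell each returns a plain array, so `.length` is the only obvious count to look at):

```javascript
// Sample the undocumented introspection APIs. Note these are internal
// Node.js functions (underscore-prefixed) and may change between versions.
const handles = process._getActiveHandles();   // open sockets, timers, streams, ...
const requests = process._getActiveRequests(); // in-flight libuv requests (fs, dns, ...)

console.log('active handles :', handles.length,
  handles.map((h) => (h && h.constructor ? h.constructor.name : typeof h)));
console.log('active requests:', requests.length,
  requests.map((r) => (r && r.constructor ? r.constructor.name : typeof r)));
```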
- Where specifically in the output of these commands can I see how many requests are queued, so I can tell whether a backlog is building up?
- How can I tell whether the queue is overflowing? Would that show up in the Node.js logs?