I maintain a few Meteor 1.7.0.3 apps deployed on AWS Elastic Beanstalk (64bit Amazon Linux/4.5.2). All of them connect to a managed MongoDB deployment (on Compose, MongoDB version 3.2.18). Recently the MongoDB deployment metrics have gone crazy: the deployment is saturated with thousands of open connections, and new requests keep coming in. When the number of connections hits 5000, the deployment's routers run out of memory and connections get refused.
Needless to say, this is causing the apps to be unresponsive/unusable!
Interesting points/details:
It happens on multiple Meteor apps, all with different code bases and little code in common. The code they do share is the collection declarations, such as:
const Accounts = new Mongo.Collection('account');
After SSHing into the EC2 instance, one can see that the node server holds multiple connections to the db:
sudo lsof -p $(ps awx | grep node | grep main.js | cut -f 2 -d " ") | egrep 'TCP|UDP'
result:
node 3749 nodejs 1486u IPv4 237550 0t0 TCP ip-172-31-60-74.ec2.internal:48210->ec2-54-174-178-28.compute-1.amazonaws.com:17847 (ESTABLISHED)
node 3749 nodejs 1487u IPv4 237694 0t0 TCP ip-172-31-60-74.ec2.internal:48306->ec2-54-174-178-28.compute-1.amazonaws.com:17847 (ESTABLISHED)
node 3749 nodejs 1488u IPv4 237854 0t0 TCP ip-172-31-60-74.ec2.internal:38462->ec2-54-84-155-20.compute-1.amazonaws.com:17847 (ESTABLISHED)
node 3749 nodejs 1489u IPv4 238043 0t0 TCP ip-172-31-60-74.ec2.internal:48506->ec2-54-174-178-28.compute-1.amazonaws.com:17847 (ESTABLISHED)
...
In total, there are thousands of lines. The limit seems to be ~5000; at that number the MongoDB deployment starts rejecting connections.
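To see which database hosts those connections pile up on, the lsof output could be tallied per remote endpoint. The following is only a minimal sketch, assuming the output keeps the shape shown above; `countByRemote` is a hypothetical helper name, not part of Meteor or any library.

```javascript
// Hypothetical helper: tally ESTABLISHED TCP connections per remote endpoint
// from lsof output lines shaped like the ones pasted above.
function countByRemote(lsofOutput) {
  const counts = {};
  for (const line of lsofOutput.split('\n')) {
    // Capture the remote "host:port" that follows "->" on ESTABLISHED lines.
    const match = line.match(/->(\S+)\s+\(ESTABLISHED\)/);
    if (!match) continue; // skip headers, UDP sockets, other states
    const remote = match[1];
    counts[remote] = (counts[remote] || 0) + 1;
  }
  return counts;
}

// The four sample lines from the lsof output above.
const sample = [
  'node 3749 nodejs 1486u IPv4 237550 0t0 TCP ip-172-31-60-74.ec2.internal:48210->ec2-54-174-178-28.compute-1.amazonaws.com:17847 (ESTABLISHED)',
  'node 3749 nodejs 1487u IPv4 237694 0t0 TCP ip-172-31-60-74.ec2.internal:48306->ec2-54-174-178-28.compute-1.amazonaws.com:17847 (ESTABLISHED)',
  'node 3749 nodejs 1488u IPv4 237854 0t0 TCP ip-172-31-60-74.ec2.internal:38462->ec2-54-84-155-20.compute-1.amazonaws.com:17847 (ESTABLISHED)',
  'node 3749 nodejs 1489u IPv4 238043 0t0 TCP ip-172-31-60-74.ec2.internal:48506->ec2-54-174-178-28.compute-1.amazonaws.com:17847 (ESTABLISHED)',
].join('\n');

console.log(countByRemote(sample));
```

Running this periodically against the full output might show whether the growth is spread evenly across the routers or concentrated on one of them.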
The constant creation of new connections drives the node.js process's CPU usage higher than normal, and its memory footprint keeps growing.
If node is restarted, the number of connections gets back to a handful, and can stay there for hours.
Not sure how this is getting triggered, but at some point one of the servers begins creating new connections and all hell breaks loose.
The Meteor server logs are empty. There's nothing in them that indicates an issue.
When the offending Meteor app is restarted, the MongoDB deployment metrics show a significant drop in open connections and router memory load.
The question is: what can I check? Any advice on which tools/techniques are best for investigating this? (Perhaps https://github.com/meteorhacks/kadira will be useful?)
Or even better, maybe someone else has bumped into this and has a solution?
Aug 30th update:
Received a note from Compose:
For some users the connection count issues appears to be caused by a bug in the node native driver which has been fixed with the v3.1.3 release (it is only about 2 weeks old).
Can anyone tell me how to update a Meteor project to use the newer version of the MongoDB driver?
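In the meantime, it may help to confirm which driver version an app is actually running and whether it predates v3.1.3. The following is a rough sketch: `isAtLeast` is a hypothetical helper, the `installed` value is a placeholder, and reading the version through `MongoInternals.NpmModules.mongodb.version` is an assumption about what the Meteor server exposes.

```javascript
// Hypothetical helper: compare dotted version strings numerically.
// Pre-release suffixes (e.g. "-beta") are not handled.
function isAtLeast(version, minimum) {
  const a = version.split('.').map(Number);
  const b = minimum.split('.').map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] || 0; // missing parts count as 0
    const y = b[i] || 0;
    if (x !== y) return x > y;
  }
  return true; // all parts equal
}

// In a Meteor server file the bundled version could presumably be read with
// (assumption, not verified against 1.7.0.3):
//   import { MongoInternals } from 'meteor/mongo';
//   const installed = MongoInternals.NpmModules.mongodb.version;
const installed = '3.0.11'; // placeholder value for illustration only

console.log(isAtLeast(installed, '3.1.3'));
```

If this reports false for the real installed version, the app would still be on a driver older than the release Compose mentions.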