
I want to get real-time updates about MongoDB database changes in Node.js.

A single MongoDB change stream sends update notifications almost instantly. But when I open multiple (10+) streams, there are massive delays (up to several minutes) between database writes and notification arrival.

This is how I set up a change stream:

// Open a change stream filtered to a single room, consumed as a Node stream.
let cursor = collection.watch([
  {$match: {"fullDocument.room": roomId}},
]);
cursor.stream().on("data", doc => {...});

I tried an alternative way to set up a stream, but it's just as slow:

// The same stream, expressed as an aggregation with a $changeStream stage.
let cursor = collection.aggregate([
  {$changeStream: {}},
  {$match: {"fullDocument.room": roomId}},
]);
cursor.forEach(doc => {...});

An automated process inserts tiny documents into the collection while collecting performance data; a minimal sketch of this harness follows the list below.

Some additional details:

  • Open change stream cursors: 50
  • Write speed: 100 docs/second (batches of 10 using insertMany)
  • Runtime: 100 seconds
  • Average delay: 7.1 seconds
  • Largest delay: 205 seconds (not a typo, over three minutes)
  • MongoDB version: 3.6.2
  • Cluster setup #1: MongoDB Atlas M10 (3-node replica set)
  • Cluster setup #2: DigitalOcean Ubuntu droplet + single-instance MongoDB in Docker
  • Node.js CPU usage: <1%
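
A minimal sketch of such a harness (my reconstruction, not the original script; it assumes driver 3.x, a local single-node replica set, since change streams require a replica set, and a hypothetical test.delays collection; each document carries a client-side timestamp so the listeners can measure delivery delay):

const {MongoClient} = require("mongodb");

(async () => {
  // Default pool size is used on purpose; this is the setup that shows the delays.
  const client = await MongoClient.connect("mongodb://localhost:27017");
  const collection = client.db("test").collection("delays");

  // Open 50 change stream cursors and record the delay of every notification.
  const delays = [];
  for (let i = 0; i < 50; i++) {
    collection.watch().on("change", event => {
      delays.push(Date.now() - event.fullDocument.sentAt);
    });
  }

  // Insert batches of 10 tiny documents, 10 times per second = 100 docs/second.
  const timer = setInterval(() => {
    const batch = Array.from({length: 10}, () => ({sentAt: Date.now()}));
    collection.insertMany(batch).catch(console.error);
  }, 100);

  // Stop after 100 seconds and report the average and largest delay.
  setTimeout(async () => {
    clearInterval(timer);
    const avg = delays.reduce((a, b) => a + b, 0) / delays.length;
    console.log(`average ${Math.round(avg)} ms, max ${Math.max(...delays)} ms`);
    await client.close();
  }, 100 * 1000);
})();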

Both setups produce the same issue. What could be going on here?

  • Did you check if you have all the needed indexes? I.e. "fullDocument.room", I guess this needs an index. – pinturic Jan 23 '18 at 23:01
  • No, I don't have any indexes. I don't really see how indexes would help sort out newly inserted items. But I'll give it a try. – aedm Jan 24 '18 at 00:01
  • Update: added an index on `room`, nothing changed. – aedm Jan 24 '18 at 00:08
  • Did you find any hint on this? – pinturic Jan 25 '18 at 08:51
  • Unfortunately, no. :( I'll start a bounty. – aedm Jan 26 '18 at 00:55
  • RAM size on these machines? `It’s estimated that after 1000 streams you will start to see very measurable performance drops. Why there is not a global change stream option to avoid having so many cursors floating around is not clear. I think it’s something that should be looked at for future versions of this feature. Up to now, many use cases of mongo, specifically in the multi-tenant world, might have > 1000 namespaces on a system. This would make the performance drop problematic.` https://www.percona.com/blog/2017/11/22/mongodb-3-6-change-streams-nest-temperature-fan-control-use-case/ – Tarun Lalwani Jan 27 '18 at 07:42
  • My dev machine has 16G, and I see considerable performance drop even with just 10 open streams. The DO machine I tested has 4G as I recall. Both memory and CPU usages were pretty low though. – aedm Jan 27 '18 at 19:24
  • 1) How are you running the processes? 2) Have you measured the network latency between your DigitalOcean box and Atlas cluster? 3) Have you tried replicating with all nodes in a local network? – Wan B. Jan 29 '18 at 01:28
  • @WanBachtiar: 1) I run a single Node.js script that creates 50 change stream cursors and then writes into the collection at 100 docs/sec. 2) I don't have exact numbers. But the delay is the same on my local computer without any internet traffic. Atlas servers have milliseconds-grade ping distance, and change streams are several orders of magnitude slower than that. The DO test case runs both Mongo and the client on the same machine, yet the issue persists. I highly doubt it's a connection issue. 3) Yes, I did, the result is the same. – aedm Jan 29 '18 at 11:04
  • I filed a bug to MongoDB if anyone's interested: https://jira.mongodb.org/browse/SERVER-32946 – aedm Jan 29 '18 at 11:04

1 Answer

The default connection pool size in the MongoDB Node.js driver is 5. Since each change stream cursor opens a new connection, the connection pool needs to be at least as large as the number of open cursors.

In version 3.x of the MongoDB Node.js driver, use `poolSize`:

const mongoConnection = await MongoClient.connect(URL, {poolSize: 100});

In version 4.x of the MongoDB Node.js driver, use `minPoolSize` and `maxPoolSize`:

const mongoConnection = await MongoClient.connect(URL, {minPoolSize: 100, maxPoolSize: 1000});
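
As a rough rule of thumb (my assumption, not official driver guidance), you can size the pool from the number of change streams the application will open, plus some headroom for ordinary queries:

// Hypothetical sizing: one connection per change stream, plus headroom
// for regular operations. STREAM_COUNT and HEADROOM are illustrative names.
const STREAM_COUNT = 50;
const HEADROOM = 10;
const mongoConnection = await MongoClient.connect(URL, {
  maxPoolSize: STREAM_COUNT + HEADROOM,
});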

(Thanks to MongoDB Inc. for investigating this issue.)

  • Thanks for this. So if I am watching 5 collections, will each of these collections also open a connection to the database? – 0.sh Jan 28 '19 at 19:55
  • @0.sh Yes, that's my understanding. Worth mentioning, even if you watch the same collection using 5 different streams, it opens 5 connections. No idea why. – aedm Jan 29 '19 at 10:11
  • And the annoying fact is, if the poolSize is equal to `db.serverStatus().connections.current`, the app will slow down drastically. – 0.sh Jan 29 '19 at 10:18
  • This is very disappointing, but it is worth noting that at least on 4.2 and later you can open a change stream on the database and then use the query to select which collections you want; thus you can do a single stream for all collections in a database, you just have to demultiplex them in your own code (see the sketch after these comments). – taxilian Sep 15 '20 at 19:21
  • Hi @taxilian, is there any maximum number of change streams that can be opened? – Dat Tan Nguyen Nov 11 '20 at 10:21
  • If each change stream cursor opens a new connection, then the maximum number would be related to the maximum number of connections and/or cursors that the DB server (and/or client) can handle. – taxilian Nov 12 '20 at 23:02
  • @0.sh I am experiencing this exact issue. When current connections are equal to the pool size, the app slows down. How do I solve this? – emorling Feb 28 '22 at 23:24
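
Following up on taxilian's comment, here is a minimal sketch of the single database-level stream approach (assuming MongoDB 4.2+ and a 4.x driver; the database and collection names are hypothetical):

// One change stream for the whole database, demultiplexed by collection name.
const db = mongoConnection.db("chat");
const wanted = ["rooms", "messages"];  // hypothetical collection names
const stream = db.watch([
  {$match: {"ns.coll": {$in: wanted}}},
]);
stream.on("change", event => {
  // event.ns.coll identifies the collection the change came from
  switch (event.ns.coll) {
    case "rooms": /* handle room changes */ break;
    case "messages": /* handle message changes */ break;
  }
});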