1

My express application needs to list the number of users I currently have in my mongo database. That works fine when I ask for a small number of users (i.e. ~100), but if I ask for numbers in the tens of thousands I begin to see a huge delay in the response time. For instance, it takes about 19 seconds to retrieve 25,000 users.

Is it normal to have this kind of performance in mongo, or should I suspect I have a bug somewhere?

Here's my model:

'use strict';

// External modules
const mongoose = require('mongoose');
const encrypt = require('mongoose-encryption');

const Schema = mongoose.Schema;

const UserSchema = new Schema({
        uuid: {
            type: String,
            trim: true,
            // TODO: test for unique uuid
            index: { unique: true },
            required: 'uuid cannot be blank'
        },
        createdOn: {
            type: Date,
            default: Date.now,
            index: true
        },
        email: {
            type: String,
            trim: true,
            required: 'email cannot be blank'
        },
        lists: [{
            type: Schema.Types.ObjectId,
            ref: 'List'
        }]
    });

UserSchema.index({ "lists" : 1 , "createdOn" : 1});
UserSchema.plugin(encrypt, { encryptionKey: process.env.EMAIL_ENCRYPTION_KEY, signingKey: process.env.EMAIL_SIGNING_KEY, encryptedFields: ['email'] });

mongoose.model('User', UserSchema);

And here's my query:

exports.list = (req, res) => {

    let page = (Number(req.query.p) > 0 ? Number(req.query.p) : 1) - 1;
    let perPage = (Number(req.query.pp) > 0 ? Number(req.query.pp) : 100);

    res.header('X-Page', page + 1);
    res.header('X-Per-Page', perPage);

    User.count({}, (countErr, count) => {

        res.header('X-Total-Count', count);

        User.find({}, { __v: 0, createdOn: 0})
            .sort({'createdOn': 1})
            .limit(perPage)
            .skip(perPage * page)
            .exec((err, users) => {
                if (err) {
                    return res.status(400).send({
                        message: err
                    });
                }
                else {
                    res.json(users);
                }
            });

    });

};

When profiling, the count() operation takes about 52ms, hence, most of the time is spent in the find() query.

syymza
  • 679
  • 1
  • 8
  • 23
  • Read this one http://stackoverflow.com/questions/19559405/mongodb-querying-performance-for-over-5-million-records – mfrachet Jun 09 '15 at 14:25
  • @Skahrz thanks for pointing that out. I think my query is way simpler (plus I have created the { "lists" : 1 , "createdOn" : 1} index, which is the only one used in the query AFAIK. – syymza Jun 09 '15 at 14:30
  • What about the machine you re running mongo on ? – mfrachet Jun 09 '15 at 14:38
  • @Skahrz the M1 cluster on Amazon by Mongolab (https://mongolab.com/plans/pricing/): RAM: 1.7GB, SSD block storage 40 GB – syymza Jun 09 '15 at 14:40
  • You're creating a compound index with `lists` as [prefix](http://docs.mongodb.org/manual/core/index-compound/#prefixes), but you're only sorting on `createdOn` (ie not part of the prefix). My guess is that your query isn't using the index for sorting. – robertklep Jun 09 '15 at 14:47
  • @robertklep I have an index for the `createdOn` field too (specified when defining the property). I am using the compound index in totally separate query. My understanding is that it should not be an issue. – syymza Jun 09 '15 at 14:51
  • @syymza you're correct, I missed the index on the property, so that should work. Have you tried running your query without `.skip()` and `.limit()` to see what the performance is in that situation? My experience with those operators is pretty bad. – robertklep Jun 09 '15 at 14:54
  • @robertklep I have just tried. But, well, things get worse without a `limit()` since the number of records to retrieve is about 5 millions. At the same time, if i set the limit to a small number, i get the results in a few milliseconds. – syymza Jun 09 '15 at 14:58
  • 2
    @syymza yeah 5M is a bit much. Have you tried running the query without Mongoose at all (to rule out that Mongoose is the bottleneck)? Also, try using [`lean`](http://mongoosejs.com/docs/api.html#query_Query-lean) and/or [`stream`](http://mongoosejs.com/docs/api.html#query_Query-stream). – robertklep Jun 09 '15 at 14:59
  • 1
    @robertklep Wow, using `lean` improved performance by one order of magnitude! – syymza Jun 09 '15 at 15:20
  • @robertklep at the same time, I seem to have lost the possibility to use `mongoose-encryption` now that I use `lean` – syymza Jun 09 '15 at 15:30
  • @syymza yeah, `lean()` means that Mongoose won't convert the documents to model instances but keeps them as plain JS objects. – robertklep Jun 09 '15 at 19:19

0 Answers0