116

When using MongoDB's $in clause, does the order of the returned documents always correspond to the order of the array argument?

styvane
user2066880

11 Answers

93

As noted, the order of the values in the array passed to $in has no bearing on the order in which the documents are retrieved. That will of course be the natural order, or the order of the selected index, as shown.

If you need to preserve this order, then you basically have two options.

So let's say that you were matching on the values of _id in your documents with an array that is going to be passed in to the $in as [ 4, 2, 8 ].

Approach using Aggregate


var list = [ 4, 2, 8 ];

db.collection.aggregate([

    // Match the selected documents by "_id"
    { "$match": {
        "_id": { "$in": [ 4, 2, 8 ] }
    }},

    // Project a "weight" to each document
    { "$project": {
        "weight": { "$cond": [
            { "$eq": [ "$_id", 4  ] },
            1,
            { "$cond": [
                { "$eq": [ "$_id", 2 ] },
                2,
                3
            ]}
        ]}
    }},

    // Sort the results
    { "$sort": { "weight": 1 } }

])

So that would be the expanded form. What basically happens here is that just as the array of values is passed to $in you also construct a "nested" $cond statement to test the values and assign an appropriate weight. As that "weight" value reflects the order of the elements in the array, you can then pass that value to a sort stage in order to get your results in the required order.

Of course you actually "build" the pipeline statement in code, much like this:

var list = [ 4, 2, 8 ];

var stack = [];

for (var i = list.length - 1; i > 0; i--) {

    var rec = {
        "$cond": [
            { "$eq": [ "$_id", list[i-1] ] },
            i
        ]
    };

    if ( stack.length == 0 ) {
        rec["$cond"].push( i+1 );
    } else {
        var lval = stack.pop();
        rec["$cond"].push( lval );
    }

    stack.push( rec );

}

var pipeline = [
    { "$match": { "_id": { "$in": list } }},
    { "$project": { "weight": stack[0] }},
    { "$sort": { "weight": 1 } }
];

db.collection.aggregate( pipeline );
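
The same nested expression can also be built as a fold over the list. The following is only a sketch, not part of the original answer: `buildWeightExpr` and `buildOrderedPipeline` are hypothetical helper names, the list is assumed to be non-empty, and unlike the loop above every element gets its own `$eq` test, with `list.length + 1` as the fallback weight.

```javascript
// Hypothetical helper: build the nested $cond "weight" expression.
// Assumes `list` is non-empty and `field` is a top-level field name.
function buildWeightExpr(list, field) {
  // Start from the fallback weight and wrap it in one $cond per element,
  // working from the last element back to the first.
  return list.reduceRight(function (elseBranch, value, index) {
    return {
      $cond: [ { $eq: [ "$" + field, value ] }, index + 1, elseBranch ]
    };
  }, list.length + 1);
}

// Hypothetical helper: the three-stage pipeline from the answer above.
function buildOrderedPipeline(list, field) {
  return [
    { $match: { [field]: { $in: list } } },
    { $project: { weight: buildWeightExpr(list, field) } },
    { $sort: { weight: 1 } }
  ];
}
```

The result of `buildOrderedPipeline([ 4, 2, 8 ], "_id")` would be passed to `db.collection.aggregate()` in the same way as `pipeline` above.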

Approach using mapReduce


Of course, if that all seems too hefty for your sensibilities, then you can do the same thing using mapReduce, which looks simpler but will likely run somewhat slower.

var list = [ 4, 2, 8 ];

db.collection.mapReduce(
    function () {
        var order = inputs.indexOf(this._id);
        emit( order, { doc: this } );
    },
    function() {},
    { 
        "out": { "inline": 1 },
        "query": { "_id": { "$in": list } },
        "scope": { "inputs": list } ,
        "finalize": function (key, value) {
            return value.doc;
        }
    }
)

And that basically relies on the emitted "key" values being in the "index order" of how they occur in the input array.


So those are essentially your ways of maintaining the order of an input list to an $in condition where you already have that list in a determined order.

Robert Rossmann
Neil Lunn
  • 2
    Great answer. For those who need it, a coffeescript version [here](https://gist.github.com/e1ff8a85e32af6ff9a0e) – Lawrence Jones Jun 25 '14 at 18:35
  • Neil, Does the $in clause list the records in the same order? – 2plus Aug 09 '14 at 14:25
  • @2plus No it does not which is what is said at the start of the answer. The point is that if you want a specified order that is not already part of the document then you must "project" a field to order on. This shows a few approaches to ordering by the arguments sent to $in, but the same applies to other similar situations. – Neil Lunn Aug 10 '14 at 03:01
  • 1
    @NeilLunn I tried the approach using aggregate, but I get the id's and the weight. Do you know how to retrieve the posts (object)? – Juanjo Lainez Reche Dec 17 '14 at 13:40
  • @JuanjoLainezReche You probably need to post this as a question on Stack Overflow rather than just comment here. If you put your full case as a posted question then we can all see what you are really trying to do and what might be failing for you. – Neil Lunn Dec 17 '14 at 13:43
  • 1
    @NeilLunn I did actually (it's here http://stackoverflow.com/questions/27525235/order-of-results-in-mongodb-with-in-like-mysql-field-id ) But the only comment was referring here, even though I checked this before posting my question. Can you help me there? Thank you! – Juanjo Lainez Reche Dec 17 '14 at 15:20
  • 1
    know this is old, but I wasted a lot of time debugging why inputs.indexOf() was not matching with this._id. If you're just returning the value of the object Id, you may have to opt for this syntax : obj.map = function() { for(var i = 0; i < inputs.length; i++){ if(this._id.equals(inputs[i])) { var order = i; } } emit(order, {doc: this}); }; – NoobSter Aug 09 '16 at 19:40
  • 2
    you can use "$addFields" instead of "$project" if you want to have all the original fields too – Jodo Oct 28 '17 at 19:57
  • @NeilLunn how does the aggregate approach perform if the array items are object ids? – wdetac Mar 19 '22 at 16:17
59

Another way using the Aggregation query, only applicable for MongoDB version >= 3.4 -

The credit goes to this nice blog post.

Example documents to be fetched in this order -

var order = [ "David", "Charlie", "Tess" ];

The query -

var query = [
             {$match: {name: {$in: order}}},
             {$addFields: {"__order": {$indexOfArray: [order, "$name" ]}}},
             {$sort: {"__order": 1}}
            ];

var result = db.users.aggregate(query);

Another quote from the post explaining these aggregation operators used -

The "$addFields" stage is new in 3.4 and it allows you to "$project" new fields to existing documents without knowing all the other existing fields. The new "$indexOfArray" expression returns position of particular element in a given array.

Basically, the $addFields stage appends a new __order field to every matched document, holding the position of that document's name in the order array we provided. Then we simply sort the documents on this field.
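
For reuse with other fields, the pipeline can be wrapped in a small builder. This is a sketch only: `orderedInPipeline` is a hypothetical helper name, and the trailing `$project` stage that drops the temporary `__order` field is an addition that is not in the blog post's query.

```javascript
// Hypothetical helper: match documents whose `field` is in `values` and
// return them in the order of `values` (MongoDB >= 3.4).
function orderedInPipeline(field, values) {
  return [
    { $match: { [field]: { $in: values } } },
    // Position of this document's value within `values`
    { $addFields: { __order: { $indexOfArray: [ values, "$" + field ] } } },
    { $sort: { __order: 1 } },
    // Assumption: the helper field is not wanted in the output
    { $project: { __order: 0 } }
  ];
}
```

With the example above, this would be called as `db.users.aggregate(orderedInPipeline("name", order))`.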

Jyotman Singh
33

If you don't want to use aggregate, another solution is to use find and then sort the doc results client-side using Array#sort.

If the $in values are primitive types like numbers you can use an approach like:

var ids = [4, 2, 8, 1, 9, 3, 5, 6];
MyModel.find({ _id: { $in: ids } }).exec(function(err, docs) {
    docs.sort(function(a, b) {
        // Sort docs by the order of their _id values in ids.
        return ids.indexOf(a._id) - ids.indexOf(b._id);
    });
});

If the $in values are non-primitive types like ObjectIds, another approach is required as indexOf compares by reference in that case.

If you're using Node.js 4.x+, you can use Array#findIndex and ObjectID#equals to handle this by changing the sort function to:

docs.sort((a, b) => ids.findIndex(id => a._id.equals(id)) - 
                    ids.findIndex(id => b._id.equals(id)));

Or with any Node.js version, with underscore/lodash's findIndex:

docs.sort(function (a, b) {
    return _.findIndex(ids, function (id) { return a._id.equals(id); }) -
           _.findIndex(ids, function (id) { return b._id.equals(id); });
});
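
Calling indexOf or findIndex inside the comparator rescans ids on every comparison. A possible variation, sketched here under the assumption that `String(id)` gives a distinct string per id (the hex string for an ObjectId, the printed value for numbers), is to precompute each id's position once. `sortByIdOrder` is a hypothetical name.

```javascript
// Hypothetical helper: return a copy of `docs` sorted by the position of
// each doc's _id in `ids`. Docs whose _id is not in `ids` go last.
function sortByIdOrder(docs, ids) {
  const position = new Map();
  ids.forEach((id, i) => {
    const key = String(id);
    // Keep the first position if an id appears more than once
    if (!position.has(key)) position.set(key, i);
  });
  const rank = doc => {
    const p = position.get(String(doc._id));
    return p === undefined ? ids.length : p;
  };
  return docs.slice().sort((a, b) => rank(a) - rank(b));
}
```

Note that keying on strings means the number `1` and the string `"1"` would be treated as the same id, so this only suits id lists of a single type.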
JohnnyHK
  • how does the equal function know to compare a id property to id 'return a.equals(id);', cause a holds all the properties returned for that model? – lboyel May 22 '16 at 15:19
  • 1
    @lboyel I didn't mean it to be that clever :-), but that worked because it was using Mongoose's [`Document#equals`](http://mongoosejs.com/docs/api.html#document_Document-equals) to compare against the doc's `_id` field. Updated to make the `_id` comparison explicit. Thanks for asking. – JohnnyHK May 22 '16 at 17:59
8

An easy way to order the result after mongo returns the array is to make an object with id as keys and then map over the given _id's to return an array that is correctly ordered.

async function batchUsers(Users, keys) {
  const unorderedUsers = await Users.find({_id: {$in: keys}}).toArray()
  let obj = {}
  unorderedUsers.forEach(x => obj[x._id]=x)
  const ordered = keys.map(key => obj[key])
  return ordered
}
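
One thing to watch with this approach is that any key with no matching document ends up as `undefined` in the result. A minimal sketch of the reordering step on its own, with such gaps dropped, might look like the following; `orderByKeys` is a hypothetical name and ids are assumed to stringify uniquely.

```javascript
// Hypothetical helper: arrange `docs` in the order of `keys`,
// skipping keys for which no document was returned.
function orderByKeys(docs, keys) {
  // Lookup table from stringified _id to document
  const byId = new Map(docs.map(doc => [String(doc._id), doc]));
  return keys
    .map(key => byId.get(String(key)))
    .filter(doc => doc !== undefined);
}
```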
Arne Jenssen
  • 1
    This does exactly what I need and is much simpler than the top comment. – dyarbrough Jun 19 '19 at 21:34
  • @dyarbrough this solution only works for queries that fetch all the documents (without limit or skip). The top comment is more complex but works for every scenario. – marian2js Aug 29 '20 at 21:36
6

Similar to JohnnyHK's solution, you can reorder the documents returned from find in your client (if your client is in JavaScript) with a combination of map and the Array.prototype.find function in ECMAScript 2015:

Collection.find({ _id: { $in: idArray } }).toArray(function(err, res) {

    var orderedResults = idArray.map(function(id) {
        return res.find(function(document) {
            return document._id.equals(id);
        });
    });

});

A couple of notes:

  • The above code is using the Mongo Node driver and not Mongoose
  • The idArray is an array of ObjectId
  • I haven't tested the performance of this method vs the sort, but if you need to manipulate each returned item (which is pretty common) you can do it in the map callback to simplify your code.
Community
tebs1200
  • Running time is O(n*n), as the inner `find` traverses the array for each element of the array (from the outer `map`). This is horribly inefficient, as there is an O(n) solution using a lookup table. – curran Dec 20 '20 at 12:30
5

I know this question is related to Mongoose JS framework, but the duplicated one is generic, so I hope posting a Python (PyMongo) solution is fine here.

things = list(db.things.find({'_id': {'$in': id_array}}))
things.sort(key=lambda thing: id_array.index(thing['_id']))
# things are now sorted according to id_array order
Community
Dennis Golomazov
3

Always? Never. The order is always the same: undefined (probably the physical order in which documents are stored). Unless you sort it.

freakish
3

For any newcomers, here is a short and elegant solution to preserve the order in such cases, as of 2021 and using MongoDB 3.6 (tested):

  const idList = ['123', '124', '125']
  const out = await db
    .collection('YourCollection')
    .aggregate([
      // Change uuid to your `id` field
      { $match: { uuid: { $in: idList } } },
      {
        $project: {
          uuid: 1,
          date: 1,
          someOtherFieldToPreserve: 1,
          // Adding this new field called index
          index: {
            // If we want index to start from 1, add a dummy value to the beginning of the idList array
            $indexOfArray: [[0, ...idList], '$uuid'],
            // Otherwise if 0,1,2 is fine just use this line
            // $indexOfArray: [idList, '$uuid'],
          },
        },
      },
      // And finally sort the output by our index
      { $sort: { index: 1 } },
    ])
    .toArray()
Goran.it
  • Great! Thanks. Also note that, for some reason, there must be some other fields to project in the `$project` operator, I mean, you can't just project the order. – David Corral Oct 31 '21 at 11:11
1

I know this is an old thread, but if you're just returning the value of the Id in the array, you may have to opt for this syntax, as I could not get indexOf to match values in the mongo ObjectId format.

  obj.map = function() {
    for(var i = 0; i < inputs.length; i++){
      if(this._id.equals(inputs[i])) {
        var order = i;
      }
    }
    emit(order, {doc: this});
  };

How to convert mongo ObjectId .toString without including 'ObjectId()' wrapper -- just the Value?

Community
NoobSter
0

You can guarantee order with $or clause.

So use $or: _ids.map(_id => ({_id})) instead.

fakenickels
  • 2
    The `$or` workaround hasn't worked [since v2.6](https://jira.mongodb.org/browse/SERVER-7528?focusedCommentId=601476&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-601476). – JohnnyHK Jun 14 '16 at 14:38
0

This solution reorders the results in Go after they are retrieved from Mongo, using a map to store each document's index and then swapping values into place.

catDetails := make([]CategoryDetail, 0)
err = sess.DB(mdb).C("category").
    Find(bson.M{
    "_id":       bson.M{"$in": path},
    "is_active": 1,
    "name":      bson.M{"$ne": ""},
    "url.path":  bson.M{"$exists": true, "$ne": ""},
}).
    Select(
    bson.M{
        "is_active": 1,
        "name":      1,
        "url.path":  1,
    }).All(&catDetails)

if err != nil {
    return 
}
categoryOrderMap := make(map[int]int)

for index, v := range catDetails {
    categoryOrderMap[v.Id] = index
}

counter := 0
for i := 0; counter < len(categoryOrderMap); i++ {
    if catId := int(path[i].(float64)); catId > 0 {
        fmt.Println("cat", catId)
        if swapIndex, exists := categoryOrderMap[catId]; exists {
            if counter != swapIndex {
                catDetails[swapIndex], catDetails[counter] = catDetails[counter], catDetails[swapIndex]
                categoryOrderMap[catId] = counter
                categoryOrderMap[catDetails[swapIndex].Id] = swapIndex
            }
            counter++
        }
    }
}
Prateek