1

Looking for way to directly query a MongoDB index, instead of the collection that the index belongs to.

For example, here are two collections:

User:

{
id: ObjectId(),
username: string
...
}

SpecialUser:

{
id: ObjectId(),
specialUsername: string
...
}

Now, let's assume that we have a unique index on the SpecialUser.specialUsername field.

We want to find the id(s) of all User documents which have their username appear in the SpecialUser.specialUsername field.

Now, we can easily do this with the following query:

db.User.aggregate([ 
  { $lookup: {from: "SpecialUser", localFieldName: "username", foreignField: "specialUsername", as "spUser"} }, 
  { $unwind: { path: "$spUser", preserveNullAndEmptyArrays: false} }, 
  { $project: {_id: 1}}
])

However, in this particular case, we are using a lookup and therefore fetching the SpecialUser collection although we don't need any information from that particular collection.

Is it possible to directly use the unique index instead of the $lookup operation for this particular case?

Dimitar Spasovski
  • 2,023
  • 9
  • 29
  • 45
  • *we don't need any information from that particular collection* - I doubt so. In your use case, the information of **existence** of certain special username is stored in the collection `SpecialUser`. I am not sure if you can skip the `$lookup` in this case. – ray Mar 15 '23 at 19:36
  • @ray but isn’t that information also stored in the index itself? – Dimitar Spasovski Mar 15 '23 at 21:52
  • Not sure about that. But if this is a frequent use case, you may want to put the 2 collections together for easier information retrieval – ray Mar 16 '23 at 17:47
  • see https://stackoverflow.com/questions/31633871/how-to-use-hint-in-mongodb-aggregation-query, maybe it's what you're looking for – dododo Mar 23 '23 at 13:30
  • The aggregation doesn't make much sense tbh. You lookup, then unwind (why?) an array with at most 1 item, then discard everything but _id from User collection without any filtering. You can get the same result by just `db.User.find({}, {_id})` which will use covered query and return data straight from User._id index. – Alex Blex Mar 24 '23 at 10:56
  • 1
    @AlexBlex The `unwind` operation will filter out the documents that have a username which is not present in the `SpecialUser` collection. Your query would just return all users. – Dimitar Spasovski Mar 24 '23 at 12:14
  • ah, indeed, I missed the preserveNullAndEmptyArrays part – Alex Blex Mar 25 '23 at 03:11

2 Answers2

0

Is it possible to directly use the unique index instead of the $lookup operation for this particular case?

How do you know this unique index is not used by it? You can try "explain" and check the output.

Given the lookup query is only using the indexed filed, i suppose (need verify of course) that it's a covered query, which means only index tree will be examined, and no disk fetch is done for that stage.

mongodb - $lookup pipeline using COLLSCAN instead of index This post may be related

suitianshi
  • 3,300
  • 1
  • 17
  • 34
0

n MongoDB, you cannot directly query an index without querying the collection it belongs to. Indexes are used internally by MongoDB to optimize query performance by providing a more efficient way to search through the documents in a collection. They are not separate entities that can be queried directly.

However, if you want to optimize your query further and reduce the amount of data fetched from the SpecialUser collection, you can use the $project stage in the aggregation pipeline to limit the fields returned from the SpecialUser collection. In this case, you only need the _id field from the SpecialUser collection.

Here's an example of how to do this:

db.User.aggregate([
  {
    $lookup: {
      from: "SpecialUser",
      localField: "username",
      foreignField: "specialUsername",
      as: "spUser",
    },
  },
  { $unwind: { path: "$spUser", preserveNullAndEmptyArrays: false } },
  {
    $project: {
      _id: 1,
      spUser: { _id: 1 }, // Limit the fields returned from the SpecialUser collection
    },
  },
]);

By projecting only the necessary fields, you can minimize the amount of data transferred between the User and SpecialUser collections, potentially improving the performance of your query. However, it's essential to understand that this optimization still involves a $lookup operation and accesses both collections.