
I have a query, shown below, that creates a link between two documents ($lookup from users to orders) and finds the last order date along with user details such as email, phone, etc. On a large data set it gives me a timeout error. Any help would be much appreciated, and thanks in advance.

db.users.aggregate([
  {
    "$lookup": {
      "from": "orders",
      "let": { "id": "$_id" },
      "pipeline": [
        // convert the string owner to an ObjectId so it can be compared with the user's _id
        { "$addFields": { "owner": { "$toObjectId": "$owner" } } },
        { "$match": { "$expr": { "$eq": ["$owner", "$$id"] } } }
      ],
      "as": "orders"
    }
  },
  // one document per (user, order) pair
  {
    "$unwind": {
      "path": "$orders",
      "preserveNullAndEmptyArrays": false,
      "includeArrayIndex": "arrayIndex"
    }
  },
  // collapse back to one document per user, keeping the last order date
  {
    "$group": {
      "_id": "$_id",
      "order": { "$last": "$orders.createdAt" },
      "userInfo": {
        "$mergeObjects": {
          "name": "$name",
          "email": "$email",
          "phone": "$phone",
          "orderCount": "$orderCount"
        }
      }
    }
  },
  {
    "$project": {
      "name": "$userInfo.name",
      "email": "$userInfo.email",
      "phone": "$userInfo.phone",
      "orderCount": "$userInfo.orderCount",
      "lastOrder": "$order"
    }
  }
])

My orders documents look like the following:

{
      "_id": ObjectId("607fbeeb0a752a66a7af40eb"),
      "address": {
        "loc": [
          -1,
          3
        ],
        "_id": "5d35d55d3d081f486d0d401c",
        "apartment": "",
        "description": "ACcdg dfef"
      },
      "approvedAt": ISODate("2021-04-21T11:28:05.295+05:30"),
      "assignedAt": null,
      "billingAddress": {
        "description": ""
      },
      "createdAt": ISODate("2021-04-21T11:28:04.449+05:30"),
      "creditCard": "",
      "deliveryDate": "04/21/21",
      "deliveryDateObj": ISODate("2021-04-21T12:27:58.746+05:30"),
      "owner": "609bd5831b912947ea51a9ac",
      "products": [
        "5a070c079b"
      ],
      "updatedAt": ISODate("2021-04-21T11:28:05.295+05:30"),
      
    }

and the users documents look like the one below:

{
      "_id": ObjectId("609bd5831b912947ea51a9ac"),
      "updatedAt": ISODate("2021-05-12T18:47:55.291+05:30"),
      "createdAt": ISODate("2021-05-12T18:47:55.213+05:30"),
      "email": "1012@gmail.com",
      "phone": "123",
      "dob": "1996-04-10",
      "password": "",
      "stripeID": "",
      "__t": "Customer",
      "name": {
        "first": "A",
        "last": "b"
      },
      "orderCount": 1,
      "__v": 0,
      "forgottenPassword": ""
    }

2 Answers

  • Convert _id to a string in the $lookup's let, so you can remove the $addFields stage from the lookup pipeline.
  • Add a $project stage in the lookup pipeline and return only the required fields.
  • Use $project to show the required fields and get the last / max createdAt date with $max; you don't need the $unwind and $group stages.
db.users.aggregate([
  {
    $lookup: {
      from: "orders",
      // convert the user's _id to a string once, so the orders pipeline needs no $addFields
      let: { id: { $toString: "$_id" } },
      pipeline: [
        { $match: { $expr: { $eq: ["$owner", "$$id"] } } },
        // return only the field we actually need from each joined order
        { $project: { _id: 0, createdAt: 1 } }
      ],
      as: "orders"
    }
  },
  {
    $project: {
      email: 1,
      name: 1,
      orderCount: { $size: "$orders" },
      phone: 1,
      // the latest order date, without $unwind / $group
      lastOrder: { $max: "$orders.createdAt" }
    }
  }
])

Playground

SUGGESTION:

  • You can save the owner id in orders as an ObjectId instead of a string, and whenever a new order arrives store it as an ObjectId; that way you avoid the $toString conversion operator (see the sketch after this list).
  • Create an index on the owner field to make the lookup faster.
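
For example, a one-off migration along these lines would do both (just a sketch; it assumes owner is currently stored as a string and the server is MongoDB 4.2+ so pipeline updates are available):

// one-time migration: turn the string owner values into ObjectIds
db.orders.updateMany(
  { owner: { $type: "string" } },
  [ { $set: { owner: { $toObjectId: "$owner" } } } ]
)

// index the field the $lookup equality match runs against
db.orders.createIndex({ owner: 1 })

After that, the let: { id: { $toString: "$_id" } } conversion is no longer needed and the lookup can compare "$owner" with "$$id" as ObjectIds directly.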
  • Still, while exporting the data a timeout occurs; is there any way I can get the whole data out in JSON format? – Avinash jain May 28 '21 at 08:32
  • How are you exporting the data? – turivishal May 28 '21 at 08:43
  • Using the Studio 3T feature to export data @turivishal – Avinash jain May 28 '21 at 08:45
  • Use the mongo shell to export the collection. See this [answer](https://stackoverflow.com/a/25936760/8987128): first $out this query's result into a collection, then export it with a command from the mongo shell. – turivishal May 28 '21 at 08:48
  • Is there any way to optimize the query further? I don't want to add anything to the collection as it is the production database. Anything I can read and apply to optimize it would be a great help, thanks. – Avinash jain May 28 '21 at 08:58
  • I think there is no issue in the query; if you are not able to export the data, the problem is Studio 3T's execution time limit. I think there is an option to increase the execution time, but I am not sure. – turivishal May 28 '21 at 09:01
  • I have tried changing the time but it didn't help – Avinash jain May 28 '21 at 12:30
  • Okay, so I think there is a huge amount of data; use the mongo shell for the export. There is no connection between the export and the query. – turivishal May 28 '21 at 12:34
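
A minimal sketch of the $out-and-export approach described in the comments above (the collection name lastOrders and the database name mydb are placeholders):

// materialize the aggregation result into a temporary collection
db.users.aggregate([
  // ... the pipeline from the answer above ...
  { $out: "lastOrders" }
])

Then export that collection from the system shell, for example with mongoexport --db=mydb --collection=lastOrders --out=lastOrders.json, so the export no longer depends on Studio 3T's execution time limit.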

I have figured out that the owner field, which is compared against the users' _id in the lookup, needed an index. After adding db.orders.createIndex({ owner: 1 }), the query runs much faster and smoother.
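
In mongo shell terms (the getIndexes() call is only there to confirm the index was created):

// index the owner field used by the $lookup's equality match
db.orders.createIndex({ owner: 1 })

// list the collection's indexes to verify it exists
db.orders.getIndexes()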
