I have a rather complicated situation that I want to solve. I have a very dynamic model that allows to reference other objects (in the same collection). The reference may be a single value or an array of references. Also, the names are not known in advance (in this example i named them ref1 and ref2 but they may be anything). What identifies the reference object is that a field with key referenceId exists. I actually did resolve that part (see further below). Sample data would look like this (I chose strings instead of ObjectIds for the sake of simplicity):
[
{
"id": "1",
"name": "First"
},
{
"id": "2",
"name": "Second",
"ref1": {
"referenceId": "3"
},
"ref2": [
{
"referenceId": "1",
},
{
"referenceId": "3"
}
]
},
{
"id": "3"
"name": "Third",
}
]
The desired result would be (after the lookups have been performed and the resulting objects replaced inline) - here for the object with id = 2:
[
{
"id": "2",
"name": "Second",
"ref1": {
"id": "1",
"name": "First"
},
"ref2": [
{
"id": "1",
"name": "First"
},
{
"id": "3"
"name": "Third",
}
]
}
]
After a lot of experimenting I came up with this idea for an aggregation pipeline which produces the following output:
[
{
"id": "2",
"name": "Second",
"ref1": {
"id": "3",
"name": "Third"
},
"ref2": [
{
"referenceId": "1"
},
{
"referenceId": "3"
}
]
},
{
"id": "2",
"name": "Second",
"ref1": {
"referenceId": "3"
},
"ref2": {
"id": "1",
"name": "First"
}
},
{
"id": "2",
"name": "Second",
"ref1": {
"referenceId": "3"
},
"ref2": {
"id": "3",
"name": "Third"
}
}
]
This is pretty close to what I want, however, I am stuck at this stage with the following problems:
- how can I remove the unwanted fields? Because of the $unwinding for the $lookup, I have duplicates of fields which I don't want in my final result (ref1 where ref2 was looked up and ref2 where ref1)
- how do I group the result into one object by the dynamic (aka unpredictable) field names (in this case ref1 and ref2?)
Is this doable in MongoDB? Am I on the right path or is this something I should consider doing in client (as in backend) code? I would prefer to solve this in aggregation since client code will have a hit on performance.
Here are the steps I am taking up so far:
- Convert the result into an array via $objectToArray, and fetch only the keys I am interested in by checking for existence of the referenceId key. If present, return the [key, value] pair, null otherwise: (match for the 2nd object from data above)
{
$match: {
id: "2"
}
}
{
$addFields: {
"references": {
$map: {
input: {
"$objectToArray": "$$ROOT"
},
as: "item",
in: {
$cond: {
if: {
$ne: [
"$$item.v.referenceId",
undefined
]
},
then: [
"$$item.k",
"$$item.v.referenceId"
],
else: null
}
}
}
}
}
}
- Filter out the null values for a subsequent map:
{
"$addFields": {
"references": {
$filter: {
input: "$references",
cond: {
$ne: [
"$$this",
null
]
}
}
}
}
}
- For a lookup I need fixed field names, therefore convert the results into an array of {"k": , "v": "}
{
"$addFields": {
"references": {
$map: {
input: "$references",
as: "item",
in: {
k: {
"$arrayElemAt": [
"$$item",
0
]
},
v: {
"$arrayElemAt": [
"$$item",
1
]
}
}
}
}
}
}
- Unwind the results to have a v field to perform the lookup with:
{
$unwind: "$references"
},
{
$unwind: "$references.v"
}
- Perform the lookup into a "resolved" field, unwind it, and convert it to an object with the name of (~ ref1, ref2 in my example above) and add it to the result document.
{
$lookup: {
"from": "collection",
"localField": "references.v",
"foreignField": "id",
"as": "resolved"
}
},
{
$unwind: "$resolved"
},
{
"$addFields": {
"k": "$references.k",
"v": "$resolved"
}
},
and finally remove the temporary fields
{
$project: {
resolved: 0,
references: 0,
}
},
Anyone up for the challenge or am I just trying too hard? ;-)