2

There is UNION command in relational databases, e.g. in Microsoft SQL.
For Microsoft SQL, UNION states

Combines the results of two or more queries into a single result set that includes all the rows that belong to all queries in the union. The UNION operation is different from using joins that combine columns from two tables.

Is there any analog of UNION in MongoDB.

The task I'd like to solve is to union documents from two collections - Accounts and Packages.

Accounts

{ "_id": 1, "accountId": 1, "accOld": "oldValue1", "accNew": "newValue1" },  
{ "_id": 2, "accountId": 1, "accOld": "oldValue2", "accNew": "newValue2" },  
{ "_id": 3, "accountId": 2, "accOld": "oldValue1", "accNew": "newValue1" }

Packages

{ "_id": 1, "accountId": 1, "pckgOld": "packageOldValue1", "pckgNew": "packageNewValue1" },  
{ "_id": 2, "accountId": 1, "pckgOld": "packageOldValue2", "pckgOld": "packageNewValue2" },  
{ "_id": 3, "accountId": 2, "pckgOld": "packageOldValue3", "pckgOld": "packageNewValue4" },  
{ "_id": 4, "accountId": 3, "pckgOld": "packageOldValue4", "pckgOld": "packageNewValue4" }

As the result "row-set" I'd like to have the following one sorted by logically grouped by accountId field

{ "accountId": 1, "accOld": "oldValue1", "accNew": "newValue1" },  
{ "accountId": 1, "accOld": "oldValue2", "accNew": "newValue2" },  
{ "accountId": 1, "pckgOld": "packageOldValue1", "pckgNew": "packageNewValue1" },  
{ "accountId": 1, "pckgOld": "packageOldValue2", "pckgOld": "packageNewValue2" },  
{ "accountId": 2, "accOld": "oldValue1", "accNew": "newValue1" }
{ "accountId": 2, "pckgOld": "packageOldValue3", "pckgOld": "packageNewValue4" },  
{ "accountId": 3, "pckgOld": "packageOldValue4", "pckgOld": "packageNewValue4" }

P.S.: I've read the following questions, but haven't found it helpful
MongoDB: Combine data from multiple collections into one..how?
Merging two collections in MongoDB
MongoDB and “joins” [duplicate]

Lesha Pipiev
  • 3,251
  • 4
  • 31
  • 65

2 Answers2

1

There is a $setUnion operator available but you have to prepare the data for it, however it is possible I think:

db.Accounts.aggregate([
    { 
        $lookup: {
            from: "Packages",
            pipeline: [],
            as: "packages"
        }
    },
    {
        $addFields: {
            packages: {
              $map: {
                   input: "$packages",
                   as: "package",
                   in: { 
                    "accountId": "$$package.accountId", 
                    "pckgOld": "$$package.pckgOld", 
                    "pckgNew": "$$package.pckgNew", 
                   }
                }
            }
        }
    },
    {
        $group: {
            _id: null,
            accounts: {
                $push: {
                    accountId: "$accountId",
                    accOld: "$accOld",
                    accNew: "$accNew"
                }
            },
            packages: {
                $first: "$packages"
            }
        }
    },
    {
        $project: {
            items: {
                $setUnion: ["$accounts", "$packages"]
            }
        }
    },
    {
        $unwind: "$items"
    },
    {
        $replaceRoot: {
            newRoot: "$items"
        }
    }
])

Basically $lookup is used here to merge two collections. We don't specify matching fields which means that every Account will have all Products as an embedded array. Then you can use $map to get rid of unnecessary properties. Next, after $group we end up having two arrays of desired format embedded in one document. That's the moment where we can use $setUnion to merge two arrays. Last two steps ($unwind and $replaceRoot are just to get a list of documents as you wish).

mickl
  • 48,568
  • 9
  • 60
  • 89
  • Hi @micki. Thanks a lot for your help. Did you have a chance to check your query? I tried it but even part of it works unexpectedly. I left only `lookup` and `addFields` section and this query outputs only list of accounts with zero length `packages` field. – Lesha Pipiev Feb 19 '18 at 20:51
  • @LeshaPipiev, yes, I executed it on your sample data and it returned exactly what you expect. Could you check your MongoDB version ? I think you need 3.6. Second thing: is your collection named exactly "Packages" ? – mickl Feb 19 '18 at 20:56
  • Hi @micki. Thanks again for your help. Don't you think that this approach is poor in terms for scaling. Just imagine there will be 5000 of accounts and 30000 of packages. – Lesha Pipiev Feb 20 '18 at 09:52
  • 1
    @LeshaPipiev I agree. What I showed here is just a proof of concept which works for small sets but would definately be an overkill if you want to merge large collections. Currently we don't have better alternative in MongoDB for unions and all you can do is to fetch both collections in two round-trips and merge the results in memory. – mickl Feb 20 '18 at 10:25
1

Since MongoDB 4.4 there is now $unionWith aggregation pipeline command, allowing you to do simply:

db.accounts.aggregate([
   { $unionWith: { coll: "packages" } }
])
Mitar
  • 6,756
  • 5
  • 54
  • 86