I am trying to design an ETL pipeline between a SQL database and MongoDB. The SQL database serves as the transactional store; new/updated data from SQL is moved to MongoDB to provide faster validation for the software.
In MongoDB, is there anything similar to SQL's MERGE operation? I haven't been able to find an example of bulk inserting/updating thousands of MongoDB documents based on newly loaded documents related by a primary key. The existing MongoDB data looks similar to the following, with dbID being the primary key in SQL:
```js
{
  _id: 1,
  dbID: 1,
  val1: "xx",
  val2: "aa"
},
{
  _id: 2,
  dbID: 2,
  val1: "y",
  val2: "bb"
},
{
  _id: 3,
  dbID: 3,
  val1: "x",
  val2: "cc"
},
{
  _id: 4,
  dbID: 4,
  val1: "z",
  val2: "bb"
}
```
Into the same collection, I load any changes from the transactional database to be processed. These can consist of inserts and updates. How can I overwrite/insert the existing documents with the new information in bulk? The staged change documents look like this:
```js
{
  _id: 5,
  dbID: 1,
  val1: "x",
  val2: "UpdatedValue",
  Type: "ETL",
  SubType: "Update"
},
{
  _id: 6,
  dbID: 3,
  val1: "x",
  val2: "UpdatedValue",
  Type: "ETL",
  SubType: "Update"
},
{
  _id: 7,
  dbID: 7,
  val1: "qwert",
  val2: "xyz",
  Type: "ETL",
  SubType: "Insert"
}
```
In SQL, I would join on dbID and update/insert whatever changed. Is there anything like this in MongoDB? My current solution (sketched below) is to loop through all Type: "ETL" records one at a time and apply each change to the existing MongoDB records. It would be great if there were a way to apply the changes in bulk instead of looping over thousands of documents.
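Here is a simplified sketch of that loop in mongosh. The collection name `data` is made up, and the cleanup of the staged documents is an assumption based on the final result below, which no longer contains the Type: "ETL" records:

```js
// Current one-at-a-time approach (simplified sketch).
// For each staged ETL change, upsert the matching "live" document by dbID.
db.data.find({ Type: "ETL" }).forEach(function (change) {
  db.data.updateOne(
    { dbID: change.dbID, Type: { $exists: false } },    // match live docs only
    { $set: { val1: change.val1, val2: change.val2 } },
    { upsert: true }                                    // insert if no live doc exists
  );
  db.data.deleteOne({ _id: change._id });               // drop the staged change
});
```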
The final result would look like this:
```js
{
  _id: 1,
  dbID: 1,
  val1: "xx",
  val2: "UpdatedValue"
},
{
  _id: 2,
  dbID: 2,
  val1: "y",
  val2: "bb"
},
{
  _id: 3,
  dbID: 3,
  val1: "x",
  val2: "UpdatedValue"
},
{
  _id: 4,
  dbID: 4,
  val1: "z",
  val2: "bb"
},
{
  _id: 7,
  dbID: 7,
  val1: "qwert",
  val2: "xyz"
}
```
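What I am hoping for is a way to submit all of those changes in one round trip instead of one write per staged document. The closest thing I have found is db.collection.bulkWrite() with upserts keyed on dbID. Is something like the following sketch the right direction, or is there a closer equivalent to MERGE?

```js
// Hypothetical bulk version: one updateOne-with-upsert op per staged change,
// all submitted in a single bulkWrite() call.
const ops = db.data.find({ Type: "ETL" }).toArray().map(function (change) {
  return {
    updateOne: {
      filter: { dbID: change.dbID, Type: { $exists: false } },
      update: { $set: { val1: change.val1, val2: change.val2 } },
      upsert: true
    }
  };
});
db.data.bulkWrite(ops, { ordered: false });  // unordered: ops may run in any order
```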
Thanks!