
I'm given a data source monthly that I parse and load into a MongoDB database. Each month, some of the data is updated and some new entries are added to the existing collections. The source file is a few gigabytes in size. Apart from these monthly updates, the data does not change at all.

Eventually, this database will be live and I want to prevent having any downtime during these monthly updates if possible. What is the best way to update my database without any downtime?


This question asks essentially the same thing, but not for a MongoDB database. The accepted answer there is to upload a new version of the database and then rename the new database to the old one's name. However, according to this question, there is no easy way to rename a MongoDB database, which rules that approach out.

Intuitively, I would try to iteratively 'upsert' the entire database using each document's unique 'gid' identifier (this is a property of the data, as opposed to the "_id" generated by MongoDB) as a filter, but this might be an inefficient way of doing things.
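To make that concrete, here is roughly what I have in mind, written with PyMongo; the connection string, database and collection names are just placeholders, and I'm assuming each parsed record is a dict carrying the source's 'gid':

```python
from pymongo import MongoClient, UpdateOne

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
coll = client["mydb"]["entries"]                   # placeholder db/collection names

# A unique index on the source-provided identifier keeps the upsert filter fast.
coll.create_index("gid", unique=True)

def upsert_batch(records, batch_size=1000):
    """Upsert parsed records in batches, keyed on the source's 'gid' field."""
    ops = []
    for rec in records:
        # Update the document if its gid already exists, insert it otherwise.
        ops.append(UpdateOne({"gid": rec["gid"]}, {"$set": rec}, upsert=True))
        if len(ops) >= batch_size:
            coll.bulk_write(ops, ordered=False)
            ops = []
    if ops:
        coll.bulk_write(ops, ordered=False)
```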

I'm running MongoDB version 4.2.1.

1 Answer


Why do you think updating the data would mean downtime?

It sounds like you don't want your users to be able to access the new data mid-load.

If this is the case, one strategy is to keep two databases, a live one and a staging one: load each month's data into staging, and rather than renaming the staging database to live, just switch the connection string (or database name) in the client application(s) so they point at the freshly loaded database.
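As a rough sketch of that switch, assuming the client reads the active database name from a small piece of configuration such as an environment variable (the variable and database names here are placeholders):

```python
import os
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string

def get_active_db():
    # e.g. ACTIVE_DB=source_2024_06 this month, flipped to source_2024_07
    # once next month's staging load has finished and been verified.
    return client[os.environ.get("ACTIVE_DB", "source_live")]

db = get_active_db()
print(db["entries"].estimated_document_count())
```

Once the staging load is complete, flipping that one config value (and restarting or reloading the clients) moves all reads to the new data in a single step, and the previous database stays available for rollback.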

Also consider mongodump and mongorestore for copying databases, although these can be slow with larger databases.

Belly Buster