We have a distributed/shared MongoDB deployment (5 to 6 replicas). I want to back up data from huge collections once it is older than, say, 1 year, and delete it from the source only after it has been backed up. I don't want any data loss, so I don't want to drop a collection completely. For example, the source collection is customer and the target collection is customer_archive.
Note: archived data must not be lost, and the archive must not contain duplicates of the same data. I assume we need a merge/upsert mechanism, but I don't know the best way to do it.
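To illustrate what I mean by merge/upsert, here is a minimal sketch in Python/pymongo, assuming _id is a stable unique key in both collections (the connection URI, database and collection names are placeholders):

```python
# Minimal upsert sketch: copying the same document twice never creates a
# duplicate in the archive, because ReplaceOne is keyed on _id.
from pymongo import MongoClient, ReplaceOne

client = MongoClient("mongodb://localhost:27017")  # assumption: adjust URI
source = client["mydb"]["customer"]
archive = client["mydb"]["customer_archive"]

def archive_batch(docs):
    """Upsert each document into the archive keyed on _id, so re-running
    the job after a failure is safe (the copy step is idempotent)."""
    ops = [ReplaceOne({"_id": d["_id"]}, d, upsert=True) for d in docs]
    if ops:
        archive.bulk_write(ops, ordered=False)
```

Because the copy is idempotent, the delete step can always be done after the copy has been confirmed, which is what protects against data loss.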
- What is the most efficient way to archive roughly 200-300 GB of data? Millions of documents are added every day, so the collection is growing fast and querying is getting slow.
- The archive collection might be in a different database; how would I write that script?
- How can I schedule that script to run monthly/daily?
- How would you do it if the backup target is a different database? My first thought was to export the data and then move it to the other database (see the sketch right after this list).
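One option for the different-database case, assuming MongoDB 4.2+ where an aggregation can end in a $merge stage that writes into another database, is to let the server do the copy instead of streaming documents through a client. The field name createdAt, the cutoff of 365 days, and the database names below are assumptions:

```python
# Sketch: server-side copy of year-old documents into another database
# via $merge (upserts by _id, so re-running it does not duplicate data).
from datetime import datetime, timedelta, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumption: adjust URI
cutoff = datetime.now(timezone.utc) - timedelta(days=365)

client["mydb"]["customer"].aggregate([
    {"$match": {"createdAt": {"$lt": cutoff}}},
    {"$merge": {
        "into": {"db": "archive_db", "coll": "customer_archive"},
        "on": "_id",
        "whenMatched": "replace",
        "whenNotMatched": "insert",
    }},
])
```

After the $merge has completed, the same {"createdAt": {"$lt": cutoff}} filter can be used to delete the copied range from the source. Note that $merge writes to another database on the same cluster; if the archive lives on a completely separate cluster, a client-side copy loop (as sketched further below) or mongodump/mongorestore is needed instead.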
My plan was to first dump the collection under a different name, then copy and delete the year-old data with bulk inserts/bulk deletes, reading and committing batches of 10,000 documents in a foreach loop. Some people advise against foreach for efficiency reasons, but I assume it is the only way to do a partial deletion. I also don't know whether I can wrap this in a batch file so I can hand it to our BT team for task scheduling.
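Here is a hedged sketch of that copy-then-delete loop, using the same upsert idea as above so a crashed or repeated run never loses or duplicates data. Batch size 10,000 comes from the plan above; the hosts, database names and the createdAt field are assumptions:

```python
# Batched copy-then-delete loop: copy a batch into the archive first
# (idempotent upserts), then delete exactly those _ids from the source.
from datetime import datetime, timedelta, timezone
from pymongo import MongoClient, ReplaceOne

BATCH = 10_000
src_client = MongoClient("mongodb://source-host:27017")   # assumption
dst_client = MongoClient("mongodb://archive-host:27017")  # assumption
source = src_client["mydb"]["customer"]
archive = dst_client["archive_db"]["customer_archive"]
cutoff = datetime.now(timezone.utc) - timedelta(days=365)

while True:
    docs = list(source.find({"createdAt": {"$lt": cutoff}}).limit(BATCH))
    if not docs:
        break
    # 1) Copy first, as upserts keyed on _id (safe to re-run after a failure).
    archive.bulk_write(
        [ReplaceOne({"_id": d["_id"]}, d, upsert=True) for d in docs],
        ordered=False,
    )
    # 2) Only then delete the documents that were just copied.
    source.delete_many({"_id": {"$in": [d["_id"] for d in docs]}})
```

An index on createdAt keeps each find cheap. A script like this can be scheduled with cron on Linux or Windows Task Scheduler (e.g. via a small .bat wrapper that calls the Python script), which should make it easy to hand over to the BT team.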