I need to parse JSON objects from a text file and persist them to MongoDB.
Some details:
- File size ~1-10 MB, ~100k JSON objects per file, so each individual object is quite small.
- MongoDB cluster (sharded and replicated).
- Performance: time is at a premium.
- I cannot insert any object into my MongoDB collection until I have parsed and validated the entire file.
- My app uses a J2EE stack (Spring 3.2).
So now I have ~100k Java objects that I need to hold somewhere before doing a bulk insert to MongoDB (the collection is sharded, so I have to pre-split chunks for better performance, etc.).
My question is: how do I make this efficient? Some approaches I have thought of:
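For the parse-and-validate-everything-first step, here is a minimal sketch of what I have in mind, using Jackson's streaming `MappingIterator` (this assumes the objects simply follow one another in the file; parsing into generic `Map`s since my real domain class isn't relevant here):

```java
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class FileValidator {

    // Parse every JSON object in the file; fail fast on the first malformed one.
    // Only if the whole file parses cleanly do we return the buffered objects,
    // so nothing is ever inserted from a partially valid file.
    public static List<Map<String, Object>> parseAndValidate(File file) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        List<Map<String, Object>> buffer = new ArrayList<>(100_000); // pre-size for ~100k objects
        MappingIterator<Map<String, Object>> it =
                mapper.readerFor(Map.class).readValues(file);
        while (it.hasNext()) {
            Map<String, Object> obj = it.next(); // throws on invalid JSON
            // domain-specific validation (required fields, value ranges, ...) would go here
            buffer.add(obj);
        }
        return buffer;
    }
}
```

At 1-10 MB per file, holding all parsed objects in memory seems cheap, and the whole-file validation requirement forces me to buffer them anyway before touching MongoDB.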
- Serialize the objects and store them in a file. (Problem: I/O time.)
- Bulk insert into a temporary collection on a standalone, non-sharded mongod, then move the data into the target collection. (Looks better than #1.)
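For the insert side, my current plan is an unordered `insertMany` in fixed-size batches via the MongoDB Java driver, so mongos can spread the writes across shards in parallel. A sketch (the connection string, database, and collection names are placeholders):

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.InsertManyOptions;
import org.bson.Document;

import java.util.ArrayList;
import java.util.List;

public class BulkLoader {

    // Split the validated objects into fixed-size batches so a single
    // insertMany call stays well under the server's message limits.
    public static <T> List<List<T>> batches(List<T> docs, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            out.add(docs.subList(i, Math.min(i + batchSize, docs.size())));
        }
        return out;
    }

    public static void load(List<Document> docs) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) { // placeholder URI
            MongoCollection<Document> coll =
                    client.getDatabase("mydb").getCollection("mycoll"); // placeholder names
            InsertManyOptions unordered = new InsertManyOptions().ordered(false);
            for (List<Document> batch : batches(docs, 1000)) {
                coll.insertMany(batch, unordered); // unordered: shards ingest batches in parallel
            }
        }
    }
}
```

I went with `ordered(false)` because, once the whole file has validated, insertion order no longer matters and unordered writes should give better throughput on a sharded cluster.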
Has anyone dealt with a similar problem and can share their experience? Let me know if any other info is needed.