1

I am trying to import a large CSV file into MongoDB. I parse the file with Commons CSV and then insert each record into my database. My problem is that the code has been running for more than 2 days and has inserted only 420,320 rows (my file has 7 million rows)!

Here is my code:

for (CSVRecord record : csvFileParser) {
    HashMap<String, String> doc = new HashMap<String, String>();
    doc.put(OCEAN_VOGID, record.get(OCEAN_VOGID));
    doc.put(OCEAN_CELLIDCIBLE, record.get(OCEAN_CELLIDCIBLE));
    doc.put(OCEAN_CELLIDSOURCE, record.get(OCEAN_CELLIDSOURCE));
    doc.put(OCEAN_ESWID, record.get(OCEAN_ESWID));
    doc.put(OCEAN_VOGCOMMENT, record.get(OCEAN_VOGCOMMENT));
    doc.put(OCEAN_VOGFLGSUP, record.get(OCEAN_VOGFLGSUP));
    doc.put(OCEAN_VOGNUMDI, record.get(OCEAN_VOGNUMDI));
    doc.put(OCEAN_VOGQUI, record.get(OCEAN_VOGQUI));
    doc.put(OCEAN_VOGQUAND, record.get(OCEAN_VOGQUAND));
    doc.put(OCEAN_VOGVERSION, record.get(OCEAN_VOGVERSION));
    doc.put(OCEAN_MODEID, record.get(OCEAN_MODEID));
    BasicDBObject document = new BasicDBObject();
    document.putAll(doc);
    table.insert(document);
    BasicDBObject searchQuery = new BasicDBObject();
    searchQuery.putAll(doc);
    DBCursor cursor = table.find(searchQuery);
    System.out.println(cursor.next());
}

Any help will be appreciated.

shilovk
MDM

4 Answers

2

Have you tried mongoimport?

You can import it using a command like this:

mongoimport --db dbname --collection collectionname --type csv --headerline --file /home/test.csv

I tried this and imported a complex 5-million-row table/CSV in a couple of minutes.

  • 1
    Sorry, I don't have any experience with that. But as long as you're inserting the documents one by one, it will take you a long time. –  Jul 06 '15 at 10:50
1

I suggest trying MongoDB's bulk insert feature. Sample code is below.

// Sample code
import com.mongodb.BasicDBObject;
import com.mongodb.BulkWriteOperation;
import com.mongodb.BulkWriteResult;
import com.mongodb.DBCollection;

// "db" is an existing com.mongodb.DB handle
DBCollection collection = db.getCollection("mycol");

BulkWriteOperation bulkWriteOperation = collection.initializeUnorderedBulkOperation();

// Queue the inserts in the loop; nothing is sent to the server yet
for (int i = 0; i < 100; i++) {
    bulkWriteOperation.insert(new BasicDBObject("_id", Integer.valueOf(i)));
}

// Execute the bulk operation on the mycol collection
BulkWriteResult result = bulkWriteOperation.execute();
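
For a 7-million-row file you would probably not queue everything into one bulk operation, but flush every few thousand rows. Here is a minimal sketch of that batching step using only the JDK. The class name `BatchingSketch`, the method `importInBatches`, the batch size of 1000 and the `"OCEAN_VOGID"` key string are all made up for illustration; the `sink` is a stand-in for the place where you would build and execute a bulk operation like the one above. Note also that the loop in the question runs a `find` on every field after each insert; without a matching index that query most likely has to scan the collection, so dropping it is probably worth at least as much as the batching.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical helper: collects rows and hands them to a sink in fixed-size batches. */
final class BatchingSketch {

    /**
     * Feeds rows to the sink in batches of at most batchSize.
     * Returns the number of batches that were flushed.
     * In a real import the sink would execute one bulk write per batch.
     */
    static int importInBatches(Iterable<Map<String, String>> rows,
                               int batchSize,
                               Consumer<List<Map<String, String>>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Map<String, String>> batch = new ArrayList<>(batchSize);
        int flushed = 0;
        for (Map<String, String> row : rows) {
            batch.add(row);
            if (batch.size() == batchSize) {
                sink.accept(batch);
                // Start a fresh list so the sink may keep a reference to the old one
                batch = new ArrayList<>(batchSize);
                flushed++;
            }
        }
        // Flush the final, possibly smaller, batch
        if (!batch.isEmpty()) {
            sink.accept(batch);
            flushed++;
        }
        return flushed;
    }

    public static void main(String[] args) {
        // Fake rows standing in for parsed CSV records (assumption: one column only)
        List<Map<String, String>> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            rows.add(Collections.singletonMap("OCEAN_VOGID", String.valueOf(i)));
        }
        int batches = importInBatches(rows, 1000,
                b -> System.out.println("flushing " + b.size() + " rows"));
        System.out.println(batches + " batches");
    }
}
```

With 2500 rows and a batch size of 1000 this flushes batches of 1000, 1000 and 500 rows. The right batch size depends on your document size and server, so treat 1000 as a starting point to measure from.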
Shivaprasad
0

You should really give mongoimport a try. I think you only have to build the database once, and then you can use it from Java and NetBeans.

mongoimport --db users --collection contacts --type csv --headerline --file /opt/backups/yourData.csv

See the official MongoDB documentation for mongoimport and the related Stack Overflow question on how to use it.

I don't know how fast this approach is, but give it a try too.

You should also consider whether MongoDB fits your use case.

bMalum
  • Thanks for your reply, but my boss wants to use an interface to import many files (as needed), so he didn't accept the mongoimport solution! – MDM Jul 06 '15 at 12:05
  • @MDM Then I think he will have to wait much longer. Another idea would be to wrap mongoimport in Java, but you would end up with a long tail of dependencies. – bMalum Jul 06 '15 at 12:13
  • 1
    Yep! I'm now trying another solution; if it works, I will post it!  – MDM Jul 06 '15 at 12:19
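
For what it's worth, the "wrap mongoimport in Java" idea from the comments could look roughly like the sketch below. It only uses the JDK's `ProcessBuilder` and the same flags as the command in this answer. The class name `MongoImportWrapper` and its methods are hypothetical, and it assumes the `mongoimport` binary is installed and on the `PATH` of the machine running the Java code.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical wrapper that shells out to the mongoimport command-line tool. */
final class MongoImportWrapper {

    /** Builds the argument list, using the same flags as the command shown in the answer. */
    static List<String> buildCommand(String db, String collection, String csvPath) {
        return Arrays.asList(
                "mongoimport",
                "--db", db,
                "--collection", collection,
                "--type", "csv",
                "--headerline",
                "--file", csvPath);
    }

    /** Runs mongoimport and returns its exit code (assumes the binary is on the PATH). */
    static int run(String db, String collection, String csvPath)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(db, collection, csvPath))
                .inheritIO() // forward mongoimport's output to this process's console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; call run(...) to actually start the import
        System.out.println(String.join(" ",
                buildCommand("users", "contacts", "/opt/backups/yourData.csv")));
    }
}
```

An interface for importing many files could then call `run(...)` once per uploaded file and check the returned exit code. Passing the arguments as a list, rather than one concatenated string, avoids quoting problems with file paths that contain spaces.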
0

I found a solution to my problem! I found this project: https://github.com/deafgoat/CSV2MongoDB. I ran it on my file twice. The first time it took 20 min to import the file into MongoDB, and the second time it took 45 min (I have some problems with my PC, which I think is why it took that long). Thank you!

MDM