
I have a web app currently on Heroku that takes plain text, mostly comma-separated values (or other delimiter-separated values), that a user copies and pastes into a web form. The app then extracts the data from each line and saves it to a MongoDB database.

For instance:

45nm, 180, 3
44nm, 180, 3.5
45nm, 90, 7
...

@project = Project.first # project embeds_many :simulations
@array_of_array_of_data_from_csv.each do |line|
  @project.simulations.create(:thick => line[0], :ang => line[1], :pro => line[2])
  #e.g. line[0] => 45nm, line[1] => 180, line[2] => 3
end
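
Roughly, the array above gets built from the pasted text like this (just a sketch; params[:data] is a stand-in for whatever the textarea field is actually named):

# Sketch: turn the pasted textarea contents into an array of arrays of strings.
# params[:data] is a placeholder name for the textarea parameter.
raw = params[:data].to_s
@array_of_array_of_data_from_csv = raw.split(/\r?\n/).map do |line|
  line.strip.split(/\s*,\s*/)   # "45nm, 180, 3" => ["45nm", "180", "3"]
end.reject(&:empty?)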

For this app's purposes, I can't let the user do any kind of file import; we have to get the data from them through a textarea. And each time, the user can paste up to 30,000 lines. I tried that (30,000 data points) with some fake data in a Heroku console, and it terminated, saying long-running processes are not supported in the console and to try a rake task instead.

So I was wondering if anyone knows either why it takes so long to insert 30,000 documents (of course, it may just be that that's the way it is), or another way to speedily insert 30,000 documents?

Thanks for your help

Nik So

1 Answer


If you are inserting that many documents, you should be doing it as a batch ... I routinely insert batches of 200,000 documents and they get created in a snap!

So, instead of making a loop that "creates" / inserts a new document each time, have your loop append each document (as a hash) to an array, and then insert that array into MongoDB as one big batch.

An example of how to do that with mongoid can be found in this question.
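
For example, something roughly like this (a sketch only; it drops down to the driver-level collection, so Mongoid validations and callbacks are skipped, and it assumes a standalone Simulation collection rather than the embedded setup in the question):

# Sketch: build all the document hashes first, then do one batch insert.
batch = @array_of_array_of_data_from_csv.map do |line|
  { :thick => line[0], :ang => line[1], :pro => line[2] }
end
# One insert call for the whole batch instead of 30,000 individual creates.
Simulation.collection.insert(batch)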

However, you should keep in mind this might end up being fairly memory intensive (as the whole array of hashes/documents will be in memory as you build it).

Just be careful :)

Justin Jenkins
  • hey thanks for that link, I just tried it and it is very fast when inserting into a collection. But here's the question: I am not inserting a lot of documents into a collection, but inserting a lot of embedded documents into one root document. Do you know of any similar nifty mongo driver commands that do that fast? – Nik So Feb 24 '11 at 03:36
  • @Nik, what is the difference? Doing the inserts/updates as individual operations on the server should be the slow part ... if you insert 30,000 in a batch or just one large document at *one time* it should take about the same amount of time? – Justin Jenkins Feb 24 '11 at 18:29
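
For the embedded-document case raised in the comments, a single driver-level update along these lines is one rough sketch of "one large document at one time" (it uses $pushAll, which the MongoDB servers of that era supported, and it bypasses Mongoid validations/callbacks, so the embedded hashes won't get Mongoid-generated ids):

# Sketch: push every simulation hash onto the project's embedded array in one update.
batch = @array_of_array_of_data_from_csv.map do |line|
  { :thick => line[0], :ang => line[1], :pro => line[2] }
end
Project.collection.update(
  { '_id' => @project.id },
  { '$pushAll' => { 'simulations' => batch } }
)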