44

I want to do a batch insert of a few thousand records into the database (Postgres in my case) from within my Rails app.

What would be the "Rails way" of doing it? Something that is fast and also the correct way of doing it.

I know I can build the SQL query by string concatenation of the attributes, but I want a better approach.

Wazery
phoenixwizard
  • See also: [How to implement bulk insert in Rails 3](http://stackoverflow.com/questions/8505263/how-to-implement-bulk-insert-in-rails-3) and [Bulk Insert records into Active Record table](http://stackoverflow.com/questions/15317837/bulk-insert-records-into-active-record-table). –  Apr 13 '14 at 05:06

4 Answers

65

ActiveRecord's `.create` method supports bulk creation. The method uses the underlying DB engine's bulk insert if the feature is supported, and emulates it otherwise.

Just pass an array of attribute hashes.

# Create an Array of new objects
User.create([{ :first_name => 'Jamie' }, { :first_name => 'Jeremy' }])

A block is also supported, and it's the common way to set shared attributes.

# Creating an Array of new objects using a block, where the block is executed for each object:
User.create([{ :first_name => 'Jamie' }, { :first_name => 'Jeremy' }]) do |u|
  u.is_admin = false
end
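As the comments below note, on some setups each element of the array still becomes its own INSERT. A hedged workaround, assuming a `User` model as above: wrap the creates in one transaction (one COMMIT instead of one per row), optionally slicing large arrays into batches. The batching part is plain Ruby; the Rails calls are shown commented since they need an app context, and the batch size of 500 is an arbitrary choice.

```ruby
# Build a few thousand attribute hashes and slice them into batches
# (plain Ruby; 500 per batch is an arbitrary choice).
records = (1..3000).map { |i| { :first_name => "user#{i}" } }
batches = records.each_slice(500).to_a

# In a Rails app (assumption: a User model as above), each batch can
# then be created inside a single transaction:
# batches.each do |batch|
#   User.transaction { User.create(batch) }
# end
```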
Simone Carletti
  • So do you think for the case of Postgres it will create a single insert statement? – phoenixwizard Apr 03 '13 at 10:56
  • It may depend on the driver version and PG version. You can try it in your console and see the SQL statement that is executed. – Simone Carletti Apr 03 '13 at 10:57
  • It seems it is creating separate queries, though surrounding it with a transaction seems to make it faster. Is there any way I can ensure a single-query insert? – phoenixwizard Apr 03 '13 at 11:03
  • Not sure the PG driver supports it at this time. https://bitbucket.org/ged/ruby-pg/pull-request/13 – Simone Carletti Apr 03 '13 at 11:06
  • I think I will go with your method wrapped in a transaction. Let's hope the driver is updated soon :) – phoenixwizard Apr 03 '13 at 11:50
  • I ended up using the activerecord-import gem. It seems to do the trick :) – phoenixwizard Apr 03 '13 at 16:31
  • This method doesn't do any magic. It just loops over the array and calls `create` on each individual hash. https://github.com/rails/rails/blob/3e88b9ed0e706906fb4aafe1ea85f990b4f42f48/activerecord/lib/active_record/persistence.rb#L29 – lobati Sep 16 '14 at 23:45
  • Use the https://github.com/zdennis/activerecord-import gem if you have any significant importing to do. This way of importing is slow. – abhishek77in Jul 04 '19 at 09:16
22

I finally reached a solution after the two answers from @Simone Carletti and @Sumit Munot.

Until the Postgres driver supports bulk insertion through ActiveRecord's `.create` method, I am going with the activerecord-import gem. It does a bulk insert, and in a single insert statement.

books = []
10.times do |i|
  books << Book.new(:name => "book #{i}")
end
Book.import books

In Postgres this leads to a single insert statement.

Once the Postgres driver supports bulk insertion through ActiveRecord's `.create` method in a single insert statement, @Simone Carletti's solution will make more sense :)
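activerecord-import also accepts a columns array plus an array of value rows, which skips instantiating a model object per record and is typically faster still. A sketch assuming the same `Book` model; the `import` call itself is commented out since it needs the gem and an app context:

```ruby
# Plain Ruby: build the columns list and the value rows for the import.
columns = [:name]
values  = 10.times.map { |i| ["book #{i}"] }

# With the gem loaded in a Rails app (not run here):
# Book.import columns, values, :validate => false
```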

phoenixwizard
  • Sadly this is still the best solution for such a simple problem. The 'write raw SQL' solutions work and may be easier for one-offs, but if you do this more than once then this gem is the best way to avoid sprinkling raw SQL everywhere. – Jun 25 '14 at 07:22
  • It says method missing on postgres-9.4, Ubuntu 14.04, Rails 4.2 – Anwar Oct 22 '15 at 13:12
  • @Anwar: Do you have `gem 'activerecord-import'` in your Gemfile? This is required to use `Model#import`. – Pete Feb 26 '16 at 22:07
3

You can write a method in your Rails model that builds and runs the insert queries, then invoke it with:

rails runner MyModelName.my_method_name

This is the best way that I used in my project.

Update:

I use the following in my project, but note that it is vulnerable to SQL injection. If you are not interpolating user input into this query, it may work for you:

user_string = "('a@ao.in','a'), ('b@ao.in','b')"
User.connection.insert("INSERT INTO users (email, name) VALUES " + user_string)
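To keep the single-statement approach without concatenating raw values, one option is to quote each value before joining. A sketch with a hypothetical `values_clause` helper; in a Rails app the quoter would be the connection adapter's `quote` method:

```ruby
# Build a multi-row VALUES clause, passing every value through a
# quoting function (in Rails: User.connection.method(:quote)).
def values_clause(rows, quoter)
  rows.map { |row|
    "(" + row.map { |v| quoter.call(v) }.join(", ") + ")"
  }.join(", ")
end

# In a Rails app (not run here):
# rows = [['a@ao.in', 'a'], ['b@ao.in', 'b']]
# sql  = "INSERT INTO users (email, name) VALUES " +
#        values_clause(rows, User.connection.method(:quote))
# User.connection.insert(sql)
```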

For multiple records:

new_records = [
  {:column => 'value', :column2 => 'value'}, 
  {:column => 'value', :column2 => 'value'}
]

MyModel.create(new_records)
Sumit Munot
0

You can do it the fast way or the Rails way ;) The best way in my experience to import bulk data into Postgres is via CSV. What takes several minutes the Rails way takes several seconds using Postgres' native CSV import (COPY) capability.

http://www.postgresql.org/docs/9.2/static/sql-copy.html

It even fires database triggers and respects database constraints.
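This can be driven from Ruby too: the pg gem exposes COPY through `copy_data`/`put_copy_data`. A sketch, assuming a `PG::Connection` named `conn` and a `users(email, name)` table; the CSV building is plain Ruby, while the COPY call is commented out since it needs a live connection:

```ruby
require 'csv'

# Plain Ruby: serialize the rows as CSV lines.
rows = [['a@ao.in', 'a'], ['b@ao.in', 'b']]
csv_data = rows.map { |r| CSV.generate_line(r) }.join

# With a live PG::Connection (not run here):
# conn.copy_data("COPY users (email, name) FROM STDIN WITH CSV") do
#   conn.put_copy_data(csv_data)
# end
```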

Edit (after your comment): Gotcha. In that case you have correctly described your two options. I have been in the same situation before: I implemented it using the Rails "1000 `save!` calls" strategy because it was the simplest thing that worked, and then optimized it to the 'append a huge query string' strategy because it performed an order of magnitude better.

Of course, premature optimization is the root of all evil, so perhaps do it the simple, slow Rails way first, knowing that building a big query string is a perfectly legitimate optimization technique at the expense of maintainability. I feel your real question is 'is there a Railsy way that doesn't involve thousands of queries?' and unfortunately the answer to that is no.

Chris Aitchison
  • I think my question is not clear. I am thinking of the use case where a user logs in with his Facebook account and I save all his friends. I am expecting around 1000 to 4000 records at a time, and I want to do it from within my Rails app. – phoenixwizard Apr 03 '13 at 10:34