I found that my `Model.create!` statements were taking a very long time to run when I added a large number of records at once. I looked at ActiveRecord-Import, but it didn't work with an array of hashes (which is what I have, and which I think is a pretty common case). How can I improve the performance?

- See also: [How to implement bulk insert in Rails 3](http://stackoverflow.com/questions/8505263/how-to-implement-bulk-insert-in-rails-3) and [Batch insertion in rails 3](http://stackoverflow.com/questions/15784305/batch-insertion-in-rails-3). – Apr 13 '14 at 05:05
- @Cupcake Those two references talk about using ActiveRecord-Import (which doesn't support an array of hashes like `create!` does) and writing SQL directly. See the accepted answer for what I did. – Jack R-G Apr 14 '14 at 17:46
- This gem aims to solve this kind of challenge: https://github.com/bjhaid/active_record_bulk_insert – bjhaid May 31 '14 at 12:31
- Maybe this helps: https://cbabhusal.wordpress.com/2015/01/03/updating-multiple-records-at-the-same-time-rails-activerecord/ – Shiva Aug 18 '15 at 02:44
- @illusionist That article addresses a very specific batch update: when all records are to be updated with the same value. For a multiple-record update with multiple values, the best approach is similar to the one taken for bulk insert (see my answer at http://stackoverflow.com/questions/18999563/rails-faster-way-to-perform-updates-on-many-records/27081266#27081266). – Jack R-G Aug 19 '15 at 03:26
7 Answers
Use the activerecord-import gem. Let's say you are reading a CSV file and generating a Product catalogue, and you want to insert records in batches of 1,000:

```ruby
require 'csv'

batch, batch_size = [], 1_000

CSV.foreach("/data/new_products.csv", :headers => true) do |row|
  batch << Product.new(row)

  # flush a full batch with a single multi-row INSERT
  if batch.size >= batch_size
    Product.import batch
    batch = []
  end
end

# import whatever is left over from the last partial batch
Product.import batch
```
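activerecord-import also accepts column names plus an array of value rows, which skips building a `Product` instance per row. A minimal sketch, with column names assumed for illustration:

```ruby
# Sketch only: columns + raw value rows avoid per-row model instantiation.
columns = [:part_number, :item_name, :cost]   # assumed schema
rows    = [["PN-1", "Widget", 9.99],
           ["PN-2", "Gadget", 4.50]]          # sample data

# validate: false skips ActiveRecord validations for extra speed
Product.import columns, rows, validate: false
```

Skipping validations trades safety for speed, so only do it for data you have already cleaned.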

- I did mention activerecord-import in my answer, but it does not address my specific case, which is an array of hashes (I believe this to be a very typical use case; it certainly was for me). Bottom line: if ar-import had support for an array of hashes, I would have used it and not written my own code. I put this out there as another alternative. – Jack R-G Mar 11 '13 at 02:54
- I missed your mention of `activerecord-import`. The example I have given deals with an array of hashes (the CSV row is a hash). If you already have the array of hashes, you can process them using the technique above. – Harish Shetty Mar 11 '13 at 06:30
- Looking into the Active Record docs, I found that there's actually built-in functionality for this in the [create method](http://railsapi.com/doc/rails-v2.3.8/classes/ActiveRecord/Base.html#M000969): `User.create([{ :first_name => 'Jamie' }, { :first_name => 'Jeremy' }]) do |u| u.is_admin = false end` (the block is executed for each object). – Máté Jul 31 '13 at 11:22
- @matov That's true, but inefficient: it generates one SQL INSERT per record. Try doing that with 10,000 records; it's rather slow. – Jack R-G Apr 14 '14 at 17:38
- @JackR-G Yeah, I'm aware. There are gems to support that sort of functionality, like the activerecord-import you mentioned, but I found it not good enough when dealing with larger data sets. – Máté May 15 '14 at 11:37
Thanks to Chris Heald (@cheald) for his 2009 article, which showed me that the best way to go was the multi-row INSERT command.

I added the following code to my `initializers/active_record.rb` file, changed my `Model.create!(...)` calls to `Model.import!(...)`, and away it goes. A couple of caveats:

1) It does not validate the data.
2) It uses the form of the SQL INSERT command that reads like ...

```sql
INSERT INTO <table> (field-1, field-2, ...)
  VALUES (value-1-1, value-1-2, ...), (value-2-1, value-2-2, ...), ...
```

... which may not be the correct syntax for all databases, but it works with Postgres. It would not be difficult to alter the code to generate the appropriate syntax for your SQL version.

In my particular case, inserting 19K+ records into a simple table on my development machine (MacBook Pro with 8GB RAM, 2.4GHz Intel Core i5, and an SSD) went from 223 seconds using `Model.create!` to 7.2 seconds using `Model.import!`.
```ruby
class ActiveRecord::Base
  def self.import!(record_list)
    raise ArgumentError, "record_list not an Array of Hashes" unless record_list.is_a?(Array) && record_list.all? { |rec| rec.is_a? Hash }

    key_list, value_list = convert_record_list(record_list)
    sql = "INSERT INTO #{self.table_name} (#{key_list.join(", ")}) VALUES #{value_list.map { |rec| "(#{rec.join(", ")})" }.join(", ")}"
    self.connection.insert_sql(sql)
  end

  def self.convert_record_list(record_list)
    # Union of all keys across records, sorted so columns and values line up
    key_list = record_list.map(&:keys).flatten.uniq.sort

    # Quote every value through the connection to produce safe SQL literals
    value_list = record_list.map do |rec|
      list = []
      key_list.each { |key| list << ActiveRecord::Base.connection.quote(rec[key]) }
      list
    end

    return [key_list, value_list]
  end
end
```
I started running into problems with large numbers of records (> 10000), so I modified the code to work in groups of 1000 records at a time. Here is a link to the new code:
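The linked gist isn't reproduced here, but based on the stepping suggested by Fernando Fabreti in the comments below, the grouped version might look roughly like this (a sketch reusing `convert_record_list` from above, not the exact gist code):

```ruby
# Rough sketch: process record_list in groups of 1,000, one multi-row
# INSERT per group, following the stepping from the comments below.
def self.import!(record_list)
  raise ArgumentError, "record_list not an Array of Hashes" unless record_list.is_a?(Array) && record_list.all? { |rec| rec.is_a? Hash }

  (0..record_list.count - 1).step(1_000).each do |start|
    key_list, value_list = convert_record_list(record_list[start..start + 999])
    sql = "INSERT INTO #{self.table_name} (#{key_list.join(", ")}) VALUES #{value_list.map { |rec| "(#{rec.join(", ")})" }.join(", ")}"
    self.connection.insert_sql(sql)
  end
end
```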

- Thank you for your gist. I'm using the `AR-SQLServer-adapter` and I had to change `self.connection.insert_sql(sql)` to `self.connection.execute(sql)`. Really fast! – Fernando Fabreti Aug 21 '14 at 00:26
- Glad you could make use of it. It's very fast compared to `create!(array-of-hashes)`. – Jack R-G Aug 22 '14 at 02:24
- Yeah, I have found it to be 70x faster! I have made a bulk_update based on your code: http://stackoverflow.com/a/25430089/873650 – Fernando Fabreti Aug 22 '14 at 14:27
- Nice! Clever use of the FROM syntax to join the target with the CSV-list. – Jack R-G Aug 24 '14 at 06:55
- About the stepping, you should use `(0..record_list.count-1).step(1000).each do |start|` and `key_list, value_list = convert_record_list(record_list[start..start+999])`; see the gist comments. – Fernando Fabreti Apr 28 '15 at 10:51
You can also use the activerecord-insert_many gem. Just make an array of hashes and pass it in:

```ruby
events = [{ name: "Movie Night", time: "10:00" }, { name: "Tutoring", time: "7:00" }, ...]
Event.insert_many(events)
```
Using a transaction speeds up bulk inserts a lot!

```ruby
Model.transaction do
  many.times { Model.create! }
end
```

If multiple models are involved, open a transaction for each affected model:

```ruby
Model1.transaction do
  Model2.transaction do
    many.times do
      m1 = Model1.create!
      m1.add_model2
    end
  end
end
```

- I create 1000 records in a transaction, and it's still iterative and takes 30 seconds. I don't think this is a good solution. – max pleaner Feb 25 '17 at 00:55
- It is. But it is only one part of the solution. To improve performance much more, you can use plain SQL; committing each record individually is much slower than wrapping the inserts in a transaction. – tvw Apr 20 '17 at 11:18
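To make the "plain SQL" suggestion in the comment above concrete, here is a minimal sketch pairing a transaction with a single multi-row INSERT; `models` and `name` are placeholder table and column names:

```ruby
# Sketch: one multi-row INSERT instead of N single-row INSERTs.
# Table/column names are illustrative assumptions.
rows = [{ name: "a" }, { name: "b" }, { name: "c" }]

Model.transaction do
  values = rows.map { |r| "(#{Model.connection.quote(r[:name])})" }.join(", ")
  Model.connection.execute("INSERT INTO models (name) VALUES #{values}")
end
```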
[This answer](https://stackoverflow.com/a/15318202/9732392) is well explained, but in my opinion, if we use an array of values instead of `Product.new(row)`, as below, it might be even faster:

```ruby
batch, batch_size = [], 1_000
product_columns = [:part_number, :item_name, :cost]

CSV.foreach("/data/new_products.csv", :headers => true) do |row|
  batch << [row[:part_number], row[:item_name], row[:cost]]

  if batch.size >= batch_size
    Product.import product_columns, batch, on_duplicate_key_update: { conflict_target: [:id], columns: [:part_number, :item_name, :cost] }
    batch = []
  end
end

# import the final partial batch
Product.import product_columns, batch, on_duplicate_key_update: { conflict_target: [:id], columns: [:part_number, :item_name, :cost] } if batch.present?
```

For more info, read https://github.com/zdennis/activerecord-import
