0

I am trying to process records in my database like so:

rows = MyModel.joins("JOIN other_model ON other_model.my_model_id = my_model.id")
                      .joins("JOIN <some other join>")
                      .where("<conditions>")
rows.find_each(batch_size: 50, &:destroy)

I am not specifying any row order, so I expect that there won't be any ORDER BY clause in the final SQL. However, when I run this code, ActiveRecord adds an ORDER BY clause; the query actually looks like this:

... WHERE <conditions> ORDER BY "my_model"."id" ASC LIMIT 50;

This is a problem as I have many records in the table, and ORDER BY clause slows everything down.

I could probably rewrite my code so as to not use ActiveRecord to select the ids, but I'm wondering why ActiveRecord behaves like this. Why does it add an ORDER By when I'm not asking it to? Is there any way to prevent this?

sawa
  • 165,429
  • 45
  • 277
  • 381
user655136
  • 161
  • 2
  • 10
  • Are you sure the slowdown on the database due to sorting is significant enough for you to worry about? – sawa Feb 18 '19 at 05:22
  • Please check your model, might be there is default scope for order by – VijayGupta Feb 18 '19 at 06:35
  • @sawa this is not a duplicate of the marked answer and removing the order will have no impact. The issue is due to `find_each` as part of [`ActiveRecord::Batches`](https://api.rubyonrails.org/v5.2.2/classes/ActiveRecord/Batches.html#method-i-find_each) which clearly states *"NOTE: It's not possible to set the order. That is automatically set to ascending on the primary key (“id ASC”) to make the batch ordering work. This also means that this method only works when the primary key is orderable (e.g. an integer or string)."* – engineersmnky Feb 18 '19 at 19:39
  • 1
    @sawa https://stackoverflow.com/questions/15189937/activerecord-find-each-combined-with-limit-and-order might be a better choice for marking as duplicate. Other options include: https://stackoverflow.com/questions/50183745/each-or-find-each-or-need-smart-each/ , https://stackoverflow.com/questions/48725199/ruby-each-loop-in-order-on-sql-order-query-result/, https://stackoverflow.com/questions/32005165/find-in-batches-does-not-use-given-order-but-uses-id-asc, etc. – engineersmnky Feb 18 '19 at 19:58
  • OP: It is extremely unlikely the order by is slowing you down (ORDER BY on a primary key is very efficient). `destroy` however has callback functionality and will have implications as such (which is probably your issue). If you do not need the destroy callbacks then `rows.find_in_batches(batch_size: 50, &:delete_all)` will be much more efficient as `delete_all` is a pure SQL delete with limited code processing and no callback overhead. – engineersmnky Feb 18 '19 at 20:14
  • @engineersmnky The question that I linked has an answer that says you can `unsccope` with nil argument to remove scoping that a re already activated. – sawa Feb 19 '19 at 07:13
  • 1
    @sawa you cannot remove the order by when using `find_each` no matter what. Ordering by the primary key is a requirement for batching to avoid race conditions ning – engineersmnky Feb 19 '19 at 12:36
  • @engineersmnky Okay. I learned from you. – sawa Feb 19 '19 at 12:44

0 Answers0