29

I want to obtain an array of ActiveRecord objects given an array of ids.

I assumed that

Object.find([5,2,3])

Would return an array with object 5, object 2, then object 3 in that order, but instead I get an array ordered as object 2, object 3 and then object 5.

The ActiveRecord Base find method API mentions that you shouldn't expect it in the order provided (other documentation doesn't give this warning).

One potential solution was given in Find by array of ids in the same order?, but the order option doesn't seem to be valid for SQLite.

I can write some ruby code to sort the objects myself (either somewhat simple and poorly scaling or better scaling and more complex), but is there A Better Way?

mu is too short
  • 426,620
  • 70
  • 833
  • 800
Andrew Grimm
  • 78,473
  • 57
  • 200
  • 338
  • 1
    Where are those id's coming from? If it is the UI (via a user selecting them) then scaling shouldn't be an issue, that is the user is unlikely to spend time selecting 1000's of ids). If it's the database (e.g. from a join table), could you store the order in the join table and issue the find based on that? – pauliephonic Apr 29 '09 at 11:13
  • It looks like this is [no longer true](https://stackoverflow.com/a/50728424/1507845) in Rails 5. – XML Slayer Jun 06 '18 at 19:49

10 Answers10

23

It's not that MySQL and other DBs sort things on their own, it's that they don't sort them. When you call Model.find([5, 2, 3]), the SQL generated is something like:

SELECT * FROM models WHERE models.id IN (5, 2, 3)

This doesn't specify an order, just the set of records you want returned. It turns out that generally MySQL will return the database rows in 'id' order, but there's no guarantee of this.

The only way to get the database to return records in a guaranteed order is to add an order clause. If your records will always be returned in a particular order, then you can add a sort column to the db and do Model.find([5, 2, 3], :order => 'sort_column'). If this isn't the case, you'll have to do the sorting in code:

ids = [5, 2, 3]
records = Model.find(ids)
sorted_records = ids.collect {|id| records.detect {|x| x.id == id}} 
tomafro
  • 5,788
  • 3
  • 26
  • 22
  • 14
    Another solution that is more performant than using #detect to avoid O(N) performance: records = Model.find(ids).group_by(&:id); sorted_records = ids.map { |id| records[id].first } – Ryan LeCompte Dec 17 '11 at 00:04
10

Based on my previous comment to Jeroen van Dijk you can do this more efficiently and in two lines using each_with_object

result_hash = Model.find(ids).each_with_object({}) {|result,result_hash| result_hash[result.id] = result }
ids.map {|id| result_hash[id]}

For reference here is the benchmark i used

ids = [5,3,1,4,11,13,10]
results = Model.find(ids)

Benchmark.measure do 
  100000.times do 
    result_hash = results.each_with_object({}) {|result,result_hash| result_hash[result.id] = result }
    ids.map {|id| result_hash[id]}
  end
end.real
#=>  4.45757484436035 seconds

Now the other one

ids = [5,3,1,4,11,13,10]
results = Model.find(ids)
Benchmark.measure do 
  100000.times do 
    ids.collect {|id| results.detect {|result| result.id == id}}
  end
end.real
# => 6.10875988006592

Update

You can do this in most using order and case statements, here is a class method you could use.

def self.order_by_ids(ids)
  order_by = ["case"]
  ids.each_with_index.map do |id, index|
    order_by << "WHEN id='#{id}' THEN #{index}"
  end
  order_by << "end"
  order(order_by.join(" "))
end

#   User.where(:id => [3,2,1]).order_by_ids([3,2,1]).map(&:id) 
#   #=> [3,2,1]
Substantial
  • 6,684
  • 2
  • 31
  • 40
Schneems
  • 14,918
  • 9
  • 57
  • 84
7

Apparently mySQL and other DB management system sort things on their own. I think that you can bypass that doing :

ids = [5,2,3]
@things = Object.find( ids, :order => "field(id,#{ids.join(',')})" )
Substantial
  • 6,684
  • 2
  • 31
  • 40
marcgg
  • 65,020
  • 52
  • 178
  • 231
  • That was the answer suggested in the link "Find by arrays of id in the same order?", but it doesn't seem to work for SQLite. – Andrew Grimm Apr 29 '09 at 23:30
  • 2
    This no longer works. For more recent Rails: `Object.where(id: ids).order("field(id, #{ids.join ','})")` – mahemoff Apr 18 '15 at 08:37
  • Improved: `Object.where(id: ids).order(ids.present? && "field(id, #{ids.join ','})")`. This prevents a SQL error if the IDs array is empty or nil. – mahemoff Apr 18 '15 at 09:09
6

A portable solution would be to use an SQL CASE statement in your ORDER BY. You can use pretty much any expression in an ORDER BY and a CASE can be used as an inlined lookup table. For example, the SQL you're after would look like this:

select ...
order by
    case id
    when 5 then 0
    when 2 then 1
    when 3 then 2
    end

That's pretty easy to generate with a bit of Ruby:

ids = [5, 2, 3]
order = 'case id ' + (0 .. ids.length).map { |i| "when #{ids[i]} then #{i}" }.join(' ') + ' end'

The above assumes that you're working with numbers or some other safe values in ids; if that's not the case then you'd want to use connection.quote or one of the ActiveRecord SQL sanitizer methods to properly quote your ids.

Then use the order string as your ordering condition:

Object.find(ids, :order => order)

or in the modern world:

Object.where(:id => ids).order(order)

This is a bit verbose but it should work the same with any SQL database and it isn't that difficult to hide the ugliness.

mu is too short
  • 426,620
  • 70
  • 833
  • 800
  • 5
    I did it with `Object.order('case id ' + ids.each_with_index.map { |id, i| "when #{id} then #{i}" }.join(' ') + ' end')` – Pete Hamilton Feb 13 '14 at 14:00
  • 1
    or to sql injection escape, put this in the map block: `sanitize_sql_array(["when ? then ? ", id, i])` (works within an AR model) – nruth May 01 '16 at 22:19
  • 1
    @nruth: There was an unstated assumption that `ids` contained numbers, assumptions are a bad idea (doubly so when they're not explicit) so I've hopefully cleared that up. – mu is too short May 02 '16 at 06:00
  • 1
    @muistooshort it's still a good answer, and generally a reasonable assumption, especially in 2013. But with javascript front-ends and uuids in fashion it seemed worth a mention, as Google points here for solving the ordering problem. – nruth May 03 '16 at 19:50
4

As I answered here, I just released a gem (order_as_specified) that allows you to do native SQL ordering like this:

Object.where(id: [5, 2, 3]).order_as_specified(id: [5, 2, 3])

Just tested and it works in SQLite.

Community
  • 1
  • 1
JacobEvelyn
  • 3,901
  • 1
  • 40
  • 51
3

Justin Weiss wrote a blog article about this problem just two days ago.

It seems to be a good approach to tell the database about the preferred order and load all records sorted in that order directly from the database. Example from his blog article:

# in config/initializers/find_by_ordered_ids.rb
module FindByOrderedIdsActiveRecordExtension
  extend ActiveSupport::Concern
  module ClassMethods
    def find_ordered(ids)
      order_clause = "CASE id "
      ids.each_with_index do |id, index|
        order_clause << "WHEN #{id} THEN #{index} "
      end
      order_clause << "ELSE #{ids.length} END"
      where(id: ids).order(order_clause)
    end
  end
end

ActiveRecord::Base.include(FindByOrderedIdsActiveRecordExtension)

That allows you to write:

Object.find_ordered([2, 1, 3]) # => [2, 1, 3]
spickermann
  • 100,941
  • 9
  • 101
  • 131
  • Nice drop-in solution but only applies to the `id`. Would be nice to be able to dynamically determine the attribute to order by, like `uid` or something. – Joshua Pinter Sep 09 '15 at 19:03
  • @JoshPinter: Just add a parameter `field_name` to the method and use that `field_name` instead of `"CASE id"` like this: `"CASE #{field_name} "` – spickermann Sep 09 '15 at 21:12
  • Yep, thanks for the follow-up. Was thinking more of allowing for `find_ordered_by_uid(uids)` but it's just building off what you said. Cheers. – Joshua Pinter Sep 09 '15 at 22:01
  • Recommend changing the name of the method to `find_by_ordered_ids` to be more explicit and match the name of the module file. – Joshua Pinter Nov 12 '15 at 16:38
  • If using a a different field than `id` that isn't an `Integer` then make sure you wrap `#{id}` with parenthesis, like so `"WHEN '#{uid}' THEN #{index} ". Can't thank you enough @spickermann for this little snippet of code - works great! – Joshua Pinter Nov 12 '15 at 17:00
  • if you're using uuid strings because of distribution (accepting ids from clients) they need escaping or you're looking at sql injection attacks. Something like `sanitize_sql_array(["when ? then ? ", id, index])` in an AR model. – nruth May 01 '16 at 22:23
  • @alexandrecosta You can use `return where('null') if ids.blank?` and it'll return an empty `ActiveRecord::Relation` when `ids` is an empty Array. – Joshua Pinter Jun 17 '16 at 01:58
  • I would recommend to use ActiveRecord's [`none`](http://apidock.com/rails/v4.2.1/ActiveRecord/QueryMethods/none) scope instead of `where('null')`. – spickermann Jun 21 '16 at 08:12
  • This will create an ambiguous query when using JOINs. To fix that use `order_clause = "CASE #{self.table_name}.id "` instead. – Joshua Pinter Dec 09 '17 at 21:14
2

Here's a performant (hash-lookup, not O(n) array search as in detect!) one-liner, as a method:

def find_ordered(model, ids)
  model.find(ids).map{|o| [o.id, o]}.to_h.values_at(*ids)
end

# We get:
ids = [3, 3, 2, 1, 3]
Model.find(ids).map(:id)          == [1, 2, 3]
find_ordered(Model, ids).map(:id) == ids
Kanat Bolazar
  • 526
  • 5
  • 9
1

Here's the simplest thing I could come up with:

ids = [200, 107, 247, 189]
results = ModelObject.find(ids).group_by(&:id)
sorted_results = ids.map {|id| results[id].first }
jasongarber
  • 2,136
  • 2
  • 18
  • 12
1

Another (probably more efficient) way to do it in Ruby:

ids = [5, 2, 3]
records_by_id = Model.find(ids).inject({}) do |result, record| 
  result[record.id] = record
  result
end
sorted_records = ids.map {|id| records_by_id[id] }
Jeroen van Dijk
  • 1,029
  • 10
  • 16
  • you can use each_with_object to make this code two lines FWIW. also did some benchmarking and it looks like this code is _much_ faster if you're iterating over a large set. – Schneems Sep 13 '11 at 19:07
  • Thanks for your comment. I didn't know of the existence of #each_with_object (http://api.rubyonrails.org/classes/Enumerable.html#method-i-each_with_object). Is there a significant difference between the above #inject and #each_with_object approach? – Jeroen van Dijk Sep 14 '11 at 14:19
  • Nope, each_with_object is essentially the same as inject except you don't have to return the object your modifying (in your case `result`) at the end of your block. It simplifies building hashes IMHO. – Schneems Sep 16 '11 at 20:14
0
@things = [5,2,3].map{|id| Object.find(id)}

This is probably the easiest way, assuming you don't have too many objects to find, since it requires a trip to the database for each id.

jcnnghm
  • 7,426
  • 7
  • 32
  • 38
  • 2
    you shouldn't do this (N+1 queries) – Schneems Sep 13 '11 at 19:20
  • It's N queries, not N+1. There are much better ways to do this, but as I said, this is the easiest. For performance, look them all up, then dump them in a hash with the ids as keys. – jcnnghm Sep 14 '11 at 02:43