3

In my rails app I have a method that looks like this:

def find_person(field, attribute)
  Person.where.not(Hash[field, nil])
      .where.not("lower(#{field})  = ?", attribute.downcase)
      .where("difference(#{field}, ?) < 2", attribute)
      .order('RANDOM()')
      .limit(3)
      .pluck("lower(#{field})")
      .uniq
end

This query is very slow because of .order('RANDOM()'). How can I make this query faster?

Ram
  • 3,092
  • 10
  • 40
  • 56
Mateusz Urbański
  • 7,352
  • 15
  • 68
  • 133
  • 1
    Will this question be of any help? http://stackoverflow.com/questions/1594483/optimizing-my-mysql-statement-rand-too-slow – BroiSatse Sep 15 '15 at 13:12
  • 3
    Try limiting the list of randoms first. Here is a pretty good [Step by Step Article](https://www.periscope.io/blog/how-to-sample-rows-in-sql-273x-faster.html) on how this works and how to implement. – engineersmnky Sep 15 '15 at 13:12
  • 1
    some more links http://www.depesz.com/2007/09/16/my-thoughts-on-getting-random-row/ and http://blog.rhodiumtoad.org.uk/2009/03/08/selecting-random-rows-from-a-table/ – Arup Rakshit Sep 15 '15 at 13:13
  • can you post an "explain analyze"? the reason is: "order by random()" cannot make use of presorted data using an index. therefore it is slow. you cannot index on random values . – Hans-Jürgen Schönig Sep 15 '15 at 13:14
  • 2
    @engineersmnky - Awesome article, thanks. :) – BroiSatse Sep 15 '15 at 13:15
  • 1
    http://stackoverflow.com/questions/1244555/how-can-i-optimize-mysqls-order-by-rand-function/25882872 – Zakwan Sep 15 '15 at 13:15

1 Answers1

1

Another approach, use Ruby to get the 3 random records.

def find_person(field, attribute)
  ids = Person.where.not(Hash[field, nil])
      .where.not("lower(#{field})  = ?", attribute.downcase)
      .where("difference(#{field}, ?) < 2", attribute)
      .pluck("id")

  Person.where(:id => ids.sample(3))
end

If pluck is too slow, another approach.

def find_person(field, attribute)
  q = Person.where.not(Hash[field, nil])
    .where.not("lower(#{field})  = ?", attribute.downcase)
    .where("difference(#{field}, ?) < 2", attribute)

  max = q.count
  offsets = [rand(max), rand(max), rand(max)]
  offsets.map { |o| q.offset(o).limit(1).pluck("lower(#{field})").first }

end
Mark Swardstrom
  • 17,217
  • 6
  • 62
  • 70