133

So, I've found several examples for finding a random record in Rails 2 -- the preferred method seems to be:

Thing.find :first, :offset => rand(Thing.count)

Being something of a newbie I'm not sure how this could be constructed using the new find syntax in Rails 3.

So, what's the "Rails 3 Way" to find a random record?

fl00r
  • 82,987
  • 33
  • 217
  • 237
Andrew
  • 42,517
  • 51
  • 181
  • 281

15 Answers15

215
Thing.first(:order => "RANDOM()") # For MySQL :order => "RAND()", - thanx, @DanSingerman
# Rails 3
Thing.order("RANDOM()").first

or

Thing.first(:offset => rand(Thing.count))
# Rails 3
Thing.offset(rand(Thing.count)).first

Actually, in Rails 3 all examples will work. But using order RANDOM is quite slow for big tables but more sql-style

UPD. You can use the following trick on an indexed column (PostgreSQL syntax):

select * 
from my_table 
where id >= trunc(
  random() * (select max(id) from my_table) + 1
) 
order by id 
limit 1;
fl00r
  • 82,987
  • 33
  • 217
  • 237
  • 11
    Your first example won't work in MySQL though - the syntax for MySQL is Thing.first(:order => "RAND()") (a danger of writing SQL rather than using the ActiveRecord abstractions) – DanSingerman Mar 17 '11 at 16:58
  • @ DanSingerman, yes it is DB specific `RAND()` or `RANDOM()`. Thanks – fl00r Mar 17 '11 at 16:59
  • And this won't create issues if there are missing items from the index? (if something in the middle of the stack gets deleted, will there be a chance it will be requested? – Victor S Nov 08 '11 at 06:39
  • @VictorS, no it won't #offset just goes to the next available record. I tested it with Ruby 1.9.2 and Rails 3.1 – SooDesuNe Nov 11 '11 at 02:35
  • when you do rand(Thing.count) isn't there a possibility it will return 0 and try to retrieve a record with id of 0, which may not exist? Or is offset accounting for that – JohnMerlino Aug 12 '14 at 16:59
  • 1
    @JohnMerlino, yes 0 is offset, not id. Offet 0 means first item according to order. – fl00r Aug 18 '14 at 08:27
  • This is really slow on a table with millions of rows – Andrew K May 03 '17 at 05:59
  • @AndrewK yes, and this is mentioned in the post. Using offset will be faster, but still, offset is also not very efficient operation. – fl00r May 04 '17 at 19:04
29

I am working on a project (Rails 3.0.15, ruby 1.9.3-p125-perf) where the db is in localhost and users table has a bit more than 100K records.

Using

order by RAND()

is quite slow

User.order("RAND(id)").first

becomes

SELECT users.* FROM users ORDER BY RAND(id) LIMIT 1

and takes from 8 to 12 seconds to respond!!

Rails log:

User Load (11030.8ms) SELECT users.* FROM users ORDER BY RAND() LIMIT 1

from mysql's explain

+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+
|  1 | SIMPLE      | users | ALL  | NULL          | NULL | NULL    | NULL | 110165 | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+--------+---------------------------------+

You can see that no index is used (possible_keys = NULL), a temporary table is created and an extra pass is required to fetch the desired value (extra = Using temporary; Using filesort).

On the other hand, by splitting the query in two parts and using Ruby, we have a reasonable improvement in response time.

users = User.scoped.select(:id);nil
User.find( users.first( Random.rand( users.length )).last )

(;nil for console use)

Rails log:

User Load (25.2ms) SELECT id FROM users User Load (0.2ms) SELECT users.* FROM users WHERE users.id = 106854 LIMIT 1

and mysql's explain proves why:

+----+-------------+-------+-------+---------------+--------------------------+---------+------+--------+-------------+
| id | select_type | table | type  | possible_keys | key                      | key_len | ref  | rows   | Extra       |
+----+-------------+-------+-------+---------------+--------------------------+---------+------+--------+-------------+
|  1 | SIMPLE      | users | index | NULL          | index_users_on_user_type | 2       | NULL | 110165 | Using index |
+----+-------------+-------+-------+---------------+--------------------------+---------+------+--------+-------------+

+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | users | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+

we can now use only indexes and the primary key and do the job about 500 times faster!

UPDATE:

as pointed out by icantbecool in comments the above solution has a flaw if there are deleted records in the table.

A workaround in that can be

users_count = User.count
User.scoped.limit(1).offset(rand(users_count)).first

which translates to two queries

SELECT COUNT(*) FROM `users`
SELECT `users`.* FROM `users` LIMIT 1 OFFSET 148794

and runs in about 500ms.

Koen.
  • 25,449
  • 7
  • 83
  • 78
xlembouras
  • 8,215
  • 4
  • 33
  • 42
  • adding ".id" after "last" to your second example will avoid a "couldn't find Model without ID" error. E.g. User.find( users.first( Random.rand( users.length )).last.id ) – turing_machine Mar 28 '14 at 04:27
  • Warning! In MySQL `RAND(id)` will **NOT** give you a different random order every query. Use `RAND()` if want a different order each query. – Justin Tanner Dec 14 '14 at 07:37
  • The User.find( users.first( Random.rand( users.length )).last.id ) will not work if there was a record deleted. [1,2,4,5,] and it potentially could pick the id of 3, but there wouldn't be an active record relation. – icantbecool Jun 02 '15 at 10:27
  • Also, users = User.scoped.select(:id);nil is not deprecated. Use this instead: users = User.where(nil).select(:id) – icantbecool Jun 02 '15 at 10:27
  • I believe using Random.rand( users.length ) as the parameter to first is a bug. Random.rand can return 0. When 0 is used as a parameter to first, the limit is set to zero and this returns no records. What one should use instead is 1 + Random( users.length) assuming users.length > 0. – SWoo Sep 18 '15 at 03:45
12

If using Postgres

User.limit(5).order("RANDOM()")

If using MySQL

User.limit(5).order("RAND()")

In both instances you're selecting 5 records randomly from the Users table. Here is the actual SQL query in displayed in the console.

SELECT * FROM users ORDER BY RANDOM() LIMIT 5
icantbecool
  • 482
  • 8
  • 16
11

I made a rails 3 gem for doing this that performs better on large tables and allows you to chain relations and scopes:

https://github.com/spilliton/randumb

(edit): The default behavior of my gem basically uses the same approach as above now, but you have the option to use the old way if you want :)

spilliton
  • 3,811
  • 5
  • 35
  • 35
6

Many of the answers posted actually won't perform well on rather large tables (1+ million rows). Random ordering quickly takes a few seconds, and doing a count on the table also takes quite long.

A solution that works well for me in this situation is to use RANDOM() with a where condition:

Thing.where('RANDOM() >= 0.9').take

On a table with over a million rows, this query generally takes less than 2ms.

fivedigit
  • 18,464
  • 6
  • 54
  • 58
  • Another advantages of your solution is use `take` function which gives `LIMIT(1)` query but return single element instead of array. So we do not need to invoke `first` – Piotr Galas Feb 05 '18 at 12:35
  • It seems to me that records at the beginning of table have higher probablility beeing selected this way, which might not be what you want to achieve. – gorn Jul 17 '18 at 14:30
5

here we go

rails way

#in your initializer
module ActiveRecord
  class Base
    def self.random
      if (c = count) != 0
        find(:first, :offset =>rand(c))
      end
    end
  end
end

usage

Model.random #returns single random object

or the second thought is

module ActiveRecord
  class Base
    def self.random
      order("RAND()")
    end
  end
end

usage:

Model.random #returns shuffled collection
Tim Kretschmer
  • 2,272
  • 1
  • 22
  • 35
4

The Ruby method for randomly picking an item from a list is sample. Wanting to create an efficient sample for ActiveRecord, and based on the previous answers, I used:

module ActiveRecord
  class Base
    def self.sample
      offset(rand(size)).first
    end
  end
end

I put this in lib/ext/sample.rb and then load it with this in config/initializers/monkey_patches.rb:

Dir[Rails.root.join('lib/ext/*.rb')].each { |file| require file }
Dan Kohn
  • 33,811
  • 9
  • 84
  • 100
  • Actually, `#count` will do a call to the DB for a `COUNT`. If the record is already loaded, this might be a bad idea. A refactor would be to use `#size` instead since it'll decide if `#count` should be used, or, if the record is already loaded, to use `#length`. – BenMorganIO Aug 18 '15 at 07:20
  • Switched from `count` to `size` based on your feedback. More info at: http://dev.mensfeld.pl/2014/09/activerecord-count-vs-length-vs-size-and-what-will-happen-if-you-use-it-the-way-you-shouldnt/ – Dan Kohn Aug 18 '15 at 11:51
4

This was very useful to me however i needed a bit more flexibility, so this is what i did:

Case1: Finding one random recordsource:trevor turk site
Add this to Thing.rb model

def self.random
    ids = connection.select_all("SELECT id FROM things")
    find(ids[rand(ids.length)]["id"].to_i) unless ids.blank?
end

then in your controller you can call something like this

@thing = Thing.random

Case2: Finding multiple random records(no repeats)source:can't remember
I needed to find 10 random records with no repeats so this is what i found worked
In your controller:

thing_ids = Thing.find( :all, :select => 'id' ).map( &:id )
@things = Thing.find( (1..10).map { thing_ids.delete_at( thing_ids.size * rand ) } )

This will find 10 random records, however it is worth mentioning that if the database is particularly large(millions of records), this would not be ideal, and performance will be hampered. Is will perform well up to a few thousand records which was sufficient for me.

Hishalv
  • 3,052
  • 3
  • 29
  • 52
3

Works in Rails 5 and is DB agnostic:

This in your controller:

@quotes = Quote.offset(rand(Quote.count - 3)).limit(3)

You can, of course, put this in a concern as shown here.

app/models/concerns/randomable.rb

module Randomable
  extend ActiveSupport::Concern

  class_methods do
    def random(the_count = 1)
      records = offset(rand(count - the_count)).limit(the_count)
      the_count == 1 ? records.first : records
    end
  end
end

then...

app/models/book.rb

class Book < ActiveRecord::Base
  include Randomable
end

Then you can use simply by doing:

Books.random

or

Books.random(3)
richardun
  • 693
  • 5
  • 11
  • This always takes subsequent records, which needs to be at least documented (as it might not be what user wants). – gorn Jul 17 '18 at 14:33
1

You can use sample() in ActiveRecord

E.g.

def get_random_things_for_home_page
  find(:all).sample(5)
end

Source: http://thinkingeek.com/2011/07/04/easily-select-random-records-rails/

Trond
  • 195
  • 1
  • 2
  • 34
    This is a very bad query to use if you have a large amount of records, as the DB will select ALL records, then Rails will pick five records from that - massively wasteful. – DaveStephens Jan 02 '13 at 12:27
  • 5
    `sample` isn't in ActiveRecord, sample is in Array. http://api.rubyonrails.org/classes/Array.html#method-i-sample – Frans Mar 31 '13 at 01:59
  • 3
    This is an expensive way to get a random record, especially from a large table. Rails will load an object for every record from your table into memory. If you need proof, run 'rails console', try 'SomeModelFromYourApp.find(:all).sample(5)' and look at the SQL produced. – Eliot Sykes Jul 18 '13 at 16:56
  • 1
    See my answer, which turns this expensive answer into a streamlined beauty for getting multiple random records. – Arcolye Aug 27 '13 at 12:24
1

If using Oracle

User.limit(10).order("DBMS_RANDOM.VALUE")

Output

SELECT * FROM users ORDER BY DBMS_RANDOM.VALUE WHERE ROWNUM <= 10
Marcelo Austria
  • 861
  • 8
  • 16
1

Strongly Recommend this gem for random records, which is specially designed for table with lots of data rows:

https://github.com/haopingfan/quick_random_records

All other answers perform badly with large database, except this gem:

  1. quick_random_records only cost 4.6ms totally.

enter image description here

  1. the accepted answer User.order('RAND()').limit(10) cost 733.0ms.

enter image description here

  1. the offset approach cost 245.4ms totally.

enter image description here

  1. the User.all.sample(10) approach cost 573.4ms.

enter image description here

Note: My table only has 120,000 users. The more records you have, the more enormous the difference of performance will be.


UPDATE:

Perform on table with 550,000 rows

  1. Model.where(id: Model.pluck(:id).sample(10)) cost 1384.0ms

enter image description here

  1. gem: quick_random_records only cost 6.4ms totally

enter image description here

Derek Fan
  • 817
  • 11
  • 10
0

Random sorting by RDBMS can be quite expensive if there are millions of records. To simplify this, you can limit the number of sorted records such way (PostgreSQL syntax):

module ActiveRecord
  class Base
    def self.sample
      where(
        "id >= TRUNC(RANDOM() * (SELECT MAX(id) FROM #{table_name}) + 1)"
      ).order(:id).first
    end
  end
end

And then User.sample

This will work more randomly in the case of ids uniform distribution

mechnicov
  • 12,025
  • 4
  • 33
  • 56
-2

A very easy way to get multiple random records from the table. This makes 2 cheap queries.

Model.where(id: Model.pluck(:id).sample(3))

You can change the "3" to the number of random records you want.

Arcolye
  • 6,968
  • 4
  • 34
  • 28
-5

I just ran into this issue developing a small application where I wanted to select a random question from my DB. I used:

@question1 = Question.where(:lesson_id => params[:lesson_id]).shuffle[1]

And it's working well for me. I can't speak on how the performance for larger DBs since this is just a small application.

zishe
  • 10,665
  • 12
  • 64
  • 103
rails_newbie
  • 53
  • 1
  • 10
  • 1
    Yeah, this is just getting all your records and using ruby array methods on them. The drawback there is of course that it means loading all your records into memory, then randomly reordering them, then grabbing the second item in the reordered array. That could definitely be a memory hog if you were dealing with a large dataset. Minor aside, why not grab the first element? (ie. `shuffle[0]`) – Andrew Apr 04 '13 at 23:04
  • must be shuffle[0] – Marcelo Austria Aug 02 '17 at 09:13