179

I need to get a random record from a table via ActiveRecord. I've followed Jamis Buck's example from 2006.
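(If memory serves, that post's approach is essentially the count-plus-offset pattern; the line below is a paraphrase in Rails 2-era syntax, not a quote from the post.)

 rand_record = Model.find(:first, :offset => rand(Model.count))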

However, I've also come across another way via a Google search (can't attribute with a link due to new user restrictions):

 rand_id = rand(Model.count)
 rand_record = Model.first(:conditions => ["id >= ?", rand_id])

I'm curious how others on here have done it or if anyone knows what way would be more efficient.

strivedi183
  • 4,749
  • 2
  • 31
  • 38
jyunderwood
  • 1,943
  • 2
  • 13
  • 8
  • 2
    2 points that might help an answer. 1. How evenly distributed are your ids, are they sequential? 2. How random does it need to be? Good enough random, or real random? – Michael May 02 '10 at 02:42
  • They are sequential ids that are auto generated by activerecord and it just has to be good enough. – jyunderwood May 02 '10 at 03:04
  • 1
    Then your proposed solution is close to ideal :) I'd use "SELECT MAX(id) FROM table_name" instead of the COUNT(*) as it will deal with deleted rows a bit better, otherwise, the rest is fine. In short, if "good enough" is ok, then you just have to have a method that assumes a distribution close to what you actually have. If it's uniform and even as you've said, simple rand works great. – Michael May 02 '10 at 03:39
  • 1
    This won't work when you have deleted rows. – Venkat D. May 21 '12 at 21:45

28 Answers

271

Rails 7

For MySQL:

Model.order("RAND()").limit(1)

Postgres/SQLite:

Model.order("RANDOM()").limit(1)

Rails 6

As stated by Jason in the comments, in Rails 6, non-attribute arguments are not allowed. You must wrap the value in an Arel.sql() statement.

Model.order(Arel.sql('RANDOM()')).first

Rails 5, 4

In Rails 4 and 5, with PostgreSQL or SQLite, use RANDOM():

Model.order('RANDOM()').first

Presumably the same would work for MySQL with RAND():

Model.order('RAND()').first

This is about 2.5 times faster than the approach in the accepted answer.

Caveat: This is slow for large datasets with millions of records, so you might want to add a limit clause.

spnkr
  • 952
  • 9
  • 18
Mohamad
  • 34,731
  • 32
  • 140
  • 219
  • 4
    "Random()" also works in sqlite, so for those of us still developing on sqlite and running postgres in production, your solution works in both environments. – wuliwong Nov 29 '14 at 16:29
  • 5
    I created a [benchmark](https://gist.github.com/panmari/73a2c203d24e7e9461d1) for this against the accepted answer. On Postgresql 9.4 the approach of this answer is about twice as fast. – panmari Jan 11 '15 at 20:58
  • 3
    Looks like it is not recommended on mysql http://www.webtrenches.com/post.cfm/avoid-rand-in-mysql – Prakash Murthy Oct 15 '15 at 17:30
  • This is fastest solution – Sergio Belevskij Jan 12 '17 at 14:25
  • Be careful! The solution looks neat but is slow for large tables because it needs to sort the whole table. More [here](https://stackoverflow.com/q/5297396/1815779) and [here](https://stackoverflow.com/a/8674844/1815779) – Linh Dam Apr 27 '18 at 03:17
  • 1
    "Non-attribute arguments will be disallowed in Rails 6.0. This method should not be called with user-provided values, such as request parameters or model attributes. Known-safe values can be passed by wrapping them in Arel.sql()." – Trenton Tyler Dec 26 '18 at 18:11
  • 1
    `.order(Arel.sql('random()'))` https://github.com/rails/rails/issues/32995 – Jason Deppen Jan 06 '20 at 17:15
  • For mysql `Model.order(Arel.sql('RAND()'))` – Abhi Jul 10 '21 at 09:52
  • 3
    `Model.except(:order).order(Arel.sql('RANDOM()')).first` if you have a default scope that overrides the order – Mat Aug 11 '21 at 16:01
  • `MyModel.order('RANDOM()')` still seems to work fine on Rails 6.1.4.7 using Postgres... ¯\\_(ツ)_/¯ – Jimbali Mar 03 '23 at 15:35
  • This being database specific is an issue when multiple databases need to be supported. – akostadinov Mar 15 '23 at 21:31
  • You better use `#take` instead of `first` and the limit will apply `LIMIT` automatically. – akostadinov Jun 29 '23 at 12:12
145

I haven't found an ideal way to do this without at least two queries.

The following uses a randomly generated number (up to the current record count) as an offset.

offset = rand(Model.count)

# Rails 4
rand_record = Model.offset(offset).first

# Rails 3
rand_record = Model.first(:offset => offset)

To be honest, I've just been using ORDER BY RAND() or RANDOM() (depending on the database). It's not a performance issue if you don't have a performance issue.
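For reference, in Rails 4+ syntax that looks like this (a sketch; RAND() for MySQL, RANDOM() for Postgres/SQLite):

# MySQL
rand_record = Model.order("RAND()").first

# PostgreSQL / SQLite
rand_record = Model.order("RANDOM()").first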

Jonathan Allard
  • 18,429
  • 11
  • 54
  • 75
Toby Hede
  • 36,755
  • 28
  • 133
  • 162
  • 2
    The code `Model.find(:offset => offset).first` will throw error. I think `Model.first(:offset => offset)` might perform better. – Harish Shetty May 03 '10 at 00:30
  • 1
    yeah, i have been working with Rails 3 and keep getting confused about the query formats between versions. – Toby Hede May 03 '10 at 00:39
  • 7
    Note that using offset is very slow with large dataset, since it actually needs index scan (or table scan, in case clustered index is used like InnoDB). In other words, it's O(N) operation but "WHERE id >= #{rand_id} ORDER BY id ASC LIMIT 1" is O(log N), which is much faster. – kenn May 06 '11 at 01:02
  • 18
    Be aware that the offset-approach only yields a single randomly found data point (the first, all after are still sorted by id). If you need multiple randomly selected records you must use this approach multiple times or use the random order method provided by your database, i.e. `Thing.order("RANDOM()").limit(100)` for 100 randomly selected entries. (Be aware that it's `RANDOM()` in PostgreSQL and `RAND()` in MySQL ... not as portable as you may want it to be.) – Florian Pilz Jun 16 '11 at 13:17
  • Beware of the edge case when `Model.count == 1`. – Martin Harrigan Jan 02 '12 at 13:39
  • I prefer this approach. If you are using Active Record to begin with, then you may as well keep the randomization logic within Ruby rather than using a construct that is database dependent. – Adrien Lamothe Jun 09 '13 at 01:03
  • 3
    Doesn't work for me on Rails 4. Use `Model.offset(offset).first`. – mahemoff Jun 06 '14 at 02:03
  • it's not supposed to work if the id is a UUID, right? Anyone know how to do it with a UUID? – buncis Apr 05 '19 at 17:58
73

Your example code will start to behave inaccurately once records are deleted (it will unfairly favor items with lower ids)

You're probably better off using the random functions within your database. These vary depending on which DB you're using, but :order => "RAND()" works for MySQL and :order => "RANDOM()" works for Postgres.

Model.first(:order => "RANDOM()") # postgres example
semanticart
  • 5,454
  • 2
  • 26
  • 20
  • 8
    ORDER BY RAND() for MySQL ends up in horrific runtime as data increases. It's unmaintainable (depending on time requirements) even starting at just thousands of rows. – Michael May 02 '10 at 02:45
  • Michael brings up a great point (that is true for other DBs as well). Generally selecting random rows from large tables isn't something you want to do in a dynamic action. Caching is your friend. Rethinking what you're trying to accomplish might not be a bad idea either. – semanticart May 02 '10 at 02:52
  • 1
    Ordering RAND() in mysql on a table with about a million rows is slooooooooooooooooooooow. – Subimage Sep 08 '11 at 19:50
  • 27
    Doesn't work anymore. Use `Model.order("RANDOM()").first` instead. – phil pirozhkov Apr 21 '13 at 18:21
  • Slow and database specific. ActiveRecord is supposed to work seamlessly between databases so you shouldn't use this method. – Dex Dec 01 '13 at 05:23
  • MySQL and `.order("RAND()")` requires "Using temporary; Using filesort". – Carson Reinke Feb 21 '14 at 16:52
  • On Postgres: use TABLESAMPLE. In an agnostic way, the best might be to set up a table column random_bucket that you fill with a random number in a range of, say, 0..999, and then when fetching N% of data from the table you randomly pick N% of the numbers between 0..999 and match your random_bucket column against those randomly picked numbers. This usually can use an index. – radiospiel Aug 26 '19 at 10:12
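A minimal sketch of that random-bucket idea from the comment above (the table, column, and bucket count are illustrative, not from the comment):

# One-time setup (migration): an indexed integer column filled with rand(1000).
#   add_column :products, :random_bucket, :integer
#   add_index  :products, :random_bucket

# To fetch roughly 1% of the table, pick 1% of the bucket values at random
# and match them against the indexed column.
buckets = (0..999).to_a.sample(10)
sample  = Product.where(random_bucket: buckets)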
29

Benchmarking these two methods on MySQL 5.1.49, Ruby 1.9.2p180, on a products table with 5+ million records:

def random1
  rand_id = rand(Product.count)
  rand_record = Product.first(:conditions => [ "id >= ?", rand_id])
end

def random2
  if (c = Product.count) != 0
    Product.find(:first, :offset =>rand(c))
  end
end

n = 10
Benchmark.bm(7) do |x|
  x.report("next id:") { n.times {|i| random1 } }
  x.report("offset:")  { n.times {|i| random2 } }
end


             user     system      total        real
next id:  0.040000   0.000000   0.040000 (  0.225149)
offset :  0.020000   0.000000   0.020000 ( 35.234383)

Offset in MySQL appears to be much slower.

EDIT I also tried

Product.first(:order => "RAND()")

But I had to kill it after ~60 seconds. MySQL was "Copying to tmp table on disk". That's not going to work.

dkam
  • 3,876
  • 2
  • 32
  • 24
  • 1
    For those looking for more tests how long a real random approach takes: I tried `Thing.order("RANDOM()").first` on a table with 250k entries - the query finished under half a second. (PostgreSQL 9.0, REE 1.8.7, 2 x 2.66 GHz cores) That's fast enough for me, since I'm doing a one-time "cleanup". – Florian Pilz Jun 16 '11 at 13:12
  • 6
    Ruby's rand method returns at most one less than the specified number, so you'll want `rand_id = rand(Product.count) + 1` or you'll never get the last record. – Ritchie Feb 24 '12 at 12:34
  • 4
    Note `random1` will not work if you ever delete a row in the table. (Count will be less than the max id and you'll never be able to select rows with high ids). – Nicholas Dec 09 '12 at 22:26
  • 1
    Using `random2` can be improved by an `#order` using an indexed column. – Carson Reinke Feb 21 '14 at 16:55
25

It is not advised that you use this solution, but if for some reason you really want to randomly select a record while only making one database query, you could use the sample method from the Ruby Array class, which allows you to select a random item from an array.

Model.all.sample

This method requires only one database query, but it's significantly slower than alternatives like Model.offset(rand(Model.count)).first, which need two database queries but are still preferred.

Sajad Torkamani
  • 544
  • 1
  • 7
  • 18
Ryan Atallah
  • 2,977
  • 26
  • 34
  • 11
    If you have 100k rows in your database, all of these would have to be loaded into memory. – Venkat D. Nov 13 '12 at 22:10
  • 4
    Of course it is not recommended for production realtime code, but I like this solution, it is very clear to use for special situations like the _seeding_ the database with fake values. – fguillen Dec 09 '12 at 10:55
  • 19
    Please - never say never. This is a great solution for development-time debugging if the table is small. (And if you're taking samples, debugging is quite possibly the use case). – mahemoff Feb 17 '13 at 00:19
  • 1
    I'm using it for seeding and it works well for me. In addition, Model.all.sample(n) works too :) – Arnaldo Ignacio Gaspar Véjar Aug 21 '13 at 16:09
  • Agree with @mahemoff, you may have a large complex table and want a sample of results, so calling something like Foo.where("json_col -> 'foo' ->> 'bar' ILIKE ?", "%baz%").sample(5) would allow you to inspect your results. – Ameet Wadhwani Nov 29 '14 at 19:17
  • I'm using Rails with a small set of fixtures, and because it's simple, I prefer to use this method in test cases. I dislike premature optimization. – sealocal Jan 13 '16 at 19:56
  • I do like this solution, but you have to be careful with it. If you have like a huge table, and if you only need a random sample, i'd probably take this logic and add `Model.last(100).sample` or something with the idea of slimming down the records. – kdweber89 Aug 22 '16 at 14:30
19

It doesn't have to be that hard.

ids = Model.pluck(:id)
random_model = Model.find(ids.sample)

pluck returns an array of all the ids in the table. The sample method on the array returns a random id from it.

This should perform well, with equal probability of selection and support for tables with deleted rows. You can even mix it with constraints.

User.where(favorite_day: "Friday").pluck(:id)

And thereby pick a random user who likes Fridays rather than just any user.
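For example, combining the two steps into one lookup (a sketch using the same model and column as above):

random_friday_fan = User.find(User.where(favorite_day: "Friday").pluck(:id).sample)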

Niels B.
  • 5,912
  • 3
  • 24
  • 44
  • 10
    This is clean and works for a small table or one-time usage, just note it won't scale. On a 3M table, plucking IDs takes about 15 seconds for me on MariaDB. – mahemoff Jun 07 '14 at 14:54
  • 2
    That's a good point. Have you found an alternative solution that's faster, while maintaining the same qualities? – Niels B. Jun 21 '14 at 16:16
  • Doesn't the accepted offset solution maintain the same qualities? – mahemoff Jun 21 '14 at 21:34
  • No, it does not support conditions and does not have an equal probability of selection for tables with deleted records. – Niels B. Jun 22 '14 at 08:14
  • If you set conditions on the count, i don't understand why you can't have those? – mahemoff Jun 22 '14 at 08:52
  • 1
    Come to think of it, if you apply the constraints when both counting and selecting with an offset, the technique should work. I was imagining only applying it on the count. – Niels B. Jun 22 '14 at 09:24
    It's possible, but definitely an answer I would only accept if it comes with a big warning. As mentioned already, it only works as long as your table contains few records; at a slightly bigger scale this performs very badly! So just *don't* use this for getting random records. – Robin Nov 18 '19 at 19:54
14

I made a rails 3 gem to handle this:

https://github.com/spilliton/randumb

It allows you to do stuff like this:

Model.where(:column => "value").random(10)
spilliton
  • 3,811
  • 5
  • 35
  • 35
  • 7
    In the documentation of this gem they explain _"randumb simply tacks an additional `ORDER BY RANDOM()` (or `RAND()` for mysql) to your query."_ – therefore, the comments on bad performance mentioned in comments to the answer by @semanticart also apply when using this gem. But at least it's DB independent. – Nicolas Jan 16 '14 at 13:20
12

Reading all of these did not give me a lot of confidence about which would work best in my particular situation with Rails 5 and MySQL/MariaDB 5.5. So I tested some of the answers on ~65,000 records, and have two takeaways:

  1. RAND() with a limit is a clear winner.
  2. Do not use pluck + sample.

def random1
  Model.find(rand((Model.last.id + 1)))
end

def random2
  Model.order("RAND()").limit(1)
end

def random3
  Model.pluck(:id).sample
end

n = 100
Benchmark.bm(7) do |x|
  x.report("find:")    { n.times {|i| random1 } }
  x.report("order:")   { n.times {|i| random2 } }
  x.report("pluck:")   { n.times {|i| random3 } }
end

              user     system      total        real
find:     0.090000   0.000000   0.090000 (  0.127585)
order:    0.000000   0.000000   0.000000 (  0.002095)
pluck:    6.150000   0.000000   6.150000 (  8.292074)

This answer synthesizes, validates, and updates Mohamad's answer, as well as Nami WANG's comment on it and Florian Pilz's comment on the accepted answer. Please send upvotes to them!

Sam
  • 1,205
  • 1
  • 21
  • 39
9

I use this so often from the console that I extend ActiveRecord in an initializer. Rails 4 example:

class ActiveRecord::Base
  def self.random
    self.limit(1).offset(rand(self.count)).first
  end
end

I can then call Foo.random to bring back a random record.

Knotty66
  • 1,711
  • 2
  • 12
  • 4
6

One query in Postgres:

User.order('RANDOM()').limit(3).to_sql # Postgres example
=> "SELECT "users".* FROM "users" ORDER BY RANDOM() LIMIT 3"

Using an offset, two queries:

offset = rand(User.count) # returns an integer between 0 and (User.count - 1)
User.offset(offset).limit(1)
Thomas Klemm
  • 10,678
  • 1
  • 51
  • 54
3

You can use the Array method sample. It returns a random object from an array; to use it, you just need to run a simple ActiveRecord query that returns a collection, for example:

User.all.sample

will return something like this:

#<User id: 25, name: "John Doe", email: "admin@example.info", created_at: "2018-04-16 19:31:12", updated_at: "2018-04-16 19:31:12">
trejo08
  • 2,348
  • 1
  • 13
  • 12
  • I wouldn't recommend working with array methods while using AR. This way takes almost 8 times the time `order('rand()').limit(1)` does "the same" job (with ~10K records). – Sebastián Palma Apr 23 '18 at 02:16
2

If you need to select some random results within specified scope:

scope :male_names, -> { where(sex: 'm') }
number_of_results = 10

random_ids = Names.male_names.pluck(:id).sample(number_of_results)
Names.where(id: random_ids)
Yuri Karpovich
  • 382
  • 4
  • 10
2

After seeing so many answers I decided to benchmark them all on my PostgreSQL (9.6.3) database. I used a smaller 100,000-row table and dropped Model.order("RANDOM()").first since it was already two orders of magnitude slower.

Using a table with 2,500,000 entries and 10 columns, the hands-down winner was the pluck method, almost 8 times faster than the runner-up (offset). I only ran this on a local server, so that number might be inflated, but it's big enough that the pluck method is what I'll end up using. It's also worth noting that this might cause issues if you pluck more than one result at a time, since each of those will be unique, i.e. less random.

Pluck wins running 100 times on my 25,000,000-row table. Edit: actually, this time includes the pluck inside the loop; if I take it out, it runs about as fast as simple iteration on the id. However, it does take up a fair amount of RAM.

RandomModel                 user     system      total        real
Model.find_by(id: i)       0.050000   0.010000   0.060000 (  0.059878)
Model.offset(rand(offset)) 0.030000   0.000000   0.030000 ( 55.282410)
Model.find(ids.sample)     6.450000   0.050000   6.500000 (  7.902458)

Here is the data running 2000 times on my 100,000 row table to rule out random

RandomModel       user     system      total        real
find_by:iterate  0.010000   0.000000   0.010000 (  0.006973)
offset           0.000000   0.000000   0.000000 (  0.132614)
"RANDOM()"       0.000000   0.000000   0.000000 ( 24.645371)
pluck            0.110000   0.020000   0.130000 (  0.175932)
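The answer doesn't include the benchmark code itself; here is a rough reconstruction of what the rows above probably measured (a sketch, so the exact method bodies are assumptions):

require 'benchmark'

ids    = Model.pluck(:id)       # done once, per the edit above
max_id = Model.maximum(:id)

n = 2000
Benchmark.bm(15) do |x|
  x.report("find_by:iterate") { n.times { Model.find_by(id: rand(max_id) + 1) } }
  x.report("offset")          { n.times { Model.offset(rand(Model.count)).first } }
  x.report("RANDOM()")        { n.times { Model.order("RANDOM()").first } }
  x.report("pluck")           { n.times { Model.find(ids.sample) } }
end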
Mendoza
  • 82
  • 7
2

Very old question, but with:

rand_record = Model.all.shuffle

You get an Array of records, sorted in random order. No gems or scripts needed.

If you want one record:

rand_record = Model.all.shuffle.first
Gregdebrick
  • 533
  • 6
  • 14
1

The Ruby method for randomly picking an item from a list is sample. Wanting to create an efficient sample for ActiveRecord, and based on the previous answers, I used:

module ActiveRecord
  class Base
    def self.sample
      offset(rand(size)).first
    end
  end
end

I put this in lib/ext/sample.rb and then load it with this in config/initializers/monkey_patches.rb:

Dir[Rails.root.join('lib/ext/*.rb')].each { |file| require file }

This will be one query if the size of the model is already cached and two otherwise.

Dan Kohn
  • 33,811
  • 9
  • 84
  • 100
1

For a MySQL database, try: Model.order("RAND()").first

Vadim Eremeev
  • 470
  • 5
  • 16
1

Rails 4.2 and Oracle:

For Oracle, you can set a scope on your model like so:

scope :random_order, -> {order('DBMS_RANDOM.RANDOM')}

or

scope :random_order, -> {order('DBMS_RANDOM.VALUE')}

And then for a sample call it like this:

Model.random_order.take(10)

or

Model.random_order.limit(5)

Of course, you could also apply the order without a scope, like so:

Model.all.order('DBMS_RANDOM.RANDOM') # or DBMS_RANDOM.VALUE respectively
mahatmanich
  • 10,791
  • 5
  • 63
  • 82
  • You can do this with Postgres with `order('random()')` and MySQL with `order('rand()')` as well. This is definitely the best answer. – jrochkind Apr 20 '16 at 18:27
1

If you're using PostgreSQL 9.5+, you can take advantage of TABLESAMPLE to select a random record.

The two default sampling methods (SYSTEM and BERNOULLI) require that you specify the number of rows to return as a percentage of the total number of rows in the table.

-- Fetch 10% of the rows in the customers table.
SELECT * FROM customers TABLESAMPLE BERNOULLI(10);

This requires knowing the number of records in the table to select the appropriate percentage, which may not be easy to find quickly. Fortunately, there is the tsm_system_rows module that allows you to specify the number of rows to return directly.

CREATE EXTENSION tsm_system_rows;

-- Fetch a single row from the customers table.
SELECT * FROM customers TABLESAMPLE SYSTEM_ROWS(1);

To use this within ActiveRecord, first enable the extension within a migration:

class EnableTsmSystemRowsExtension < ActiveRecord::Migration[5.0]
  def change
    enable_extension "tsm_system_rows"
  end
end

Then modify the from clause of the query:

customer = Customer.from("customers TABLESAMPLE SYSTEM_ROWS(1)").first

I don't know if the SYSTEM_ROWS sampling method will be entirely random or if it just returns the first row from a random page.

Most of this information was taken from a 2ndQuadrant blog post written by Gulcin Yildirim.

Adam Sheehan
  • 2,122
  • 23
  • 19
0

I'm brand new to RoR but I got this to work for me:

 def random
    @cards = Card.all.sort_by { rand }
 end

It came from:

How to randomly sort (scramble) an array in Ruby?

Community
  • 1
  • 1
Aaron Pennington
  • 566
  • 5
  • 12
  • 4
    The bad thing about it is that it's going to load all cards from the database. It's more efficient to do it inside database. – Anton Kuzmin Nov 09 '13 at 10:28
  • You can also shuffle arrays with `array.shuffle`. Anyway, beware, as `Card.all` will load all card records into memory, which gets more inefficient the more objects we are talking about. – Thomas Klemm Nov 11 '13 at 13:20
0

What about writing:

rand_record = Model.find(Model.pluck(:id).sample)

This makes what you're doing clear.

Mendoza
  • 82
  • 7
poramo
  • 546
  • 5
  • 7
0

I tried Sam's Benchmark example on my app using Rails 4.2.8. I used rand(1..Category.count), because if rand returns 0 it produces an error (ActiveRecord::RecordNotFound: Couldn't find Category with 'id'=0). Mine was:

def random1
  Category.find(rand(1..Category.count))
end

def random2
  Category.offset(rand(1..Category.count))
end

def random3
  Category.offset(rand(1..Category.count)).limit(rand(1..3))
end

def random4
  Category.pluck(rand(1..Category.count))
end

n = 100
Benchmark.bm(7) do |x|
  x.report("find") { n.times {|i| random1 } }
  x.report("offset") { n.times {|i| random2 } }
  x.report("offset_limit") { n.times {|i| random3 } }
  x.report("pluck") { n.times {|i| random4 } }
end

                  user      system      total     real
find            0.070000   0.010000   0.080000 (0.118553)
offset          0.040000   0.010000   0.050000 (0.059276)
offset_limit    0.050000   0.000000   0.050000 (0.060849)
pluck           0.070000   0.020000   0.090000 (0.099065)
rld
  • 2,603
  • 2
  • 25
  • 39
0

.order('RANDOM()').limit(limit) looks neat but is slow for large tables because it needs to fetch and sort all rows even if limit is 1 (internally in the database, though not in Rails). I'm not sure about MySQL, but this happens in Postgres. More explanation here and here.

One solution for large tables is .from("products TABLESAMPLE SYSTEM(0.5)") where 0.5 means 0.5%. However, I find this solution still slow if you have WHERE conditions that filter out a lot of rows. I guess it's because TABLESAMPLE SYSTEM(0.5) fetches all rows before the WHERE conditions apply.

Another solution for large tables (but not very random) is:

products_scope.limit(sample_size).sample(limit)

where sample_size can be 100 (but not too large, otherwise it's slow and consumes a lot of memory), and limit can be 1. Note that although this is fast, it's not really random; it's only random within the first sample_size records of the scope.

PS: The benchmark results in the answers above are not reliable (at least in Postgres) because a DB query run a second time can be significantly faster than the first run, thanks to the DB cache. And unfortunately there is no easy way to disable the cache in Postgres to make these benchmarks reliable.

Linh Dam
  • 2,033
  • 1
  • 19
  • 18
0

Along with using RANDOM(), you can also throw this into a scope:

class Thing
  scope :random, -> (limit = 1) {
    order('RANDOM()').
    limit(limit)
  }
end

Or, if you don't fancy that as a scope, just throw it into a class method. Now Thing.random works along with Thing.random(n).
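A class-method version might look something like this (a sketch; RANDOM() is Postgres/SQLite, and on Rails 6+ the raw SQL should be wrapped in Arel.sql):

class Thing < ApplicationRecord
  # Same behaviour as the scope above, as a class method.
  def self.random(limit = 1)
    order(Arel.sql('RANDOM()')).limit(limit)
  end
end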

Damien Roche
  • 13,189
  • 18
  • 68
  • 96
0

You can get an array of all ids and then return a random element with the sample method.

Model.ids.sample
Dragonn steve
  • 11
  • 1
  • 1
0

If you want to run benchmarks on your database of choice, here is a template:

require 'bundler/inline'

gemfile do
  source 'https://rubygems.org'
  gem 'activerecord', git: 'https://github.com/rails/rails'
  gem 'sqlite3'
  gem 'benchmark'
end

require 'active_record'
require 'benchmark'

ActiveRecord::Base.establish_connection(adapter: 'sqlite3', database: ':memory:')

ActiveRecord::Schema.define do
  create_table :users
end

class User < ActiveRecord::Base
  def self.sample_random
    order('RANDOM()').first
  end

  def self.sample_pluck_id_sample
    find(pluck(:id).sample)
  end

  def self.sample_all_sample
    all.sample
  end

  def self.sample_offset_rand_count
    offset(rand(count)).first
  end
end

USERS_COUNTS = [1000, 10_000, 100_000, 1_000_000]
N = 100

USERS_COUNTS.each do |count|
  puts "Creating #{count} users"

  User.insert_all((1..count).map { |id| { id: id } })

  Benchmark.bm do |x|
    x.report("sample_random") { N.times { User.sample_random } }
    x.report("sample_offset_rand_count") { N.times { User.sample_offset_rand_count } }
    if count < 10_000
      x.report("sample_pluck_id_sample") { N.times { User.sample_pluck_id_sample } }
      x.report("sample_all_sample") { N.times { User.sample_all_sample } }
    end
  end

  puts "Deleting #{User.count} users"

  User.delete_all
end
Dorian
  • 7,749
  • 4
  • 38
  • 57
0

Random sorting by the RDBMS can be quite expensive if there are millions of records. To avoid this, you can limit the number of sorted records this way (PostgreSQL syntax):

class ApplicationRecord < ActiveRecord::Base
  def self.sample
    where(
      "id >= TRUNC(RANDOM() * (SELECT MAX(id) FROM #{table_name}) + 1)"
    ).order(:id).first
  end
end

And then call User.sample

This behaves most randomly when the ids are roughly uniformly distributed.

mechnicov
  • 12,025
  • 4
  • 33
  • 56
0

I found an answer! The most powerful way to get a random record is to use this gem!

It's as easy as:

User.random

Or

User.random.limit(10)

Without any configuration!

https://github.com/the-rubies-way/random-rails

-2

Depending on the meaning of "random" and what you actually want to do, take could be enough.

By the "meaning" of random I mean:

  • Do you mean "give me any element, I don't care about its position"? Then it is enough.
  • Now, if you mean "give me any element, with a fair probability that repeated experiments will give me different elements from the set", then force the "luck" with any of the methods mentioned in the other answers.

For example, for testing, sample data could have been created randomly anyway, so take is more than enough, and to be honest, even first is.

https://guides.rubyonrails.org/active_record_querying.html#take
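For completeness, that looks like this (take issues a plain LIMIT query with no ordering):

any_record  = Model.take     # one record, no ORDER BY
any_records = Model.take(5)  # up to five records, no ORDER BY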

jgomo3
  • 1,153
  • 1
  • 13
  • 26