297

I am using MSSQL Server 2005. In my db, I have a table "customerNames" which has two columns "Id" and "Name" and approx. 1,000 results.

I am creating a functionality where I have to pick 5 customers randomly every time. Can anyone tell me how to create a query which will get random 5 rows (Id, and Name) every time when query is executed?

Kate Gregory
  • 18,808
  • 8
  • 56
  • 85
djmzfKnm
  • 26,679
  • 70
  • 166
  • 227
  • 2
    Depends on how much randomness you want. See: http://msdn.microsoft.com/en-us/library/aa175776(SQL.80).aspx for comparison of NEW_ID versus RAND() – Shannon Severance Jul 30 '09 at 23:36
  • Random is not a common requirement for a Database, I was surprised to find a [link](http://www.petefreitag.com/item/466.cfm) for some SQL – Paxic Feb 24 '09 at 06:20

14 Answers14

814
SELECT TOP 5 Id, Name FROM customerNames
ORDER BY NEWID()

That said, everybody seems to come to this page for the more general answer to your question:

Selecting a random row in SQL

Select a random row with MySQL:

SELECT column FROM table
ORDER BY RAND()
LIMIT 1

Select a random row with PostgreSQL:

SELECT column FROM table
ORDER BY RANDOM()
LIMIT 1

Select a random row with Microsoft SQL Server:

SELECT TOP 1 column FROM table
ORDER BY NEWID()

Select a random row with IBM DB2

SELECT column, RAND() as IDX 
FROM table 
ORDER BY IDX FETCH FIRST 1 ROWS ONLY

Select a random record with Oracle:

SELECT column FROM
( SELECT column FROM table
ORDER BY dbms_random.value )
WHERE rownum = 1

Select a random row with sqlite:

SELECT column FROM table 
ORDER BY RANDOM() LIMIT 1
Curtis Tasker
  • 11,115
  • 2
  • 23
  • 23
  • 31
    Does this become very expensive on large tables, where each row gets a random number, and then a large unindexed random number set is sorted? – Andrey Apr 19 '14 at 16:04
  • This is perhaps obvious to most people, but it wasn't obvious to me... the following query will not get a new random value for each row: `update tbl_vouchers set tbl_UsersID = (select top(1) id from tbl_Users order by NEWID())` - edit: I can't get formatting to work in comments :( – Mir Dec 10 '15 at 18:35
  • Why does this fail on Google Cloud SQL? We only get partially random results. Nearly 80% of the time we get the same row back. – Praxiteles Nov 25 '16 at 22:38
  • 6
    Warning: For big databases this method will have a bad performance. Can you imagine the time it will take to generate a random value for each row if the database have a million of entry? You can have more information about and a better alternativ [here](https://www.warpconduit.net/2011/03/23/selecting-a-random-record-using-mysql-benchmark-results/). – Francis Ngueukam Dec 15 '16 at 09:33
  • Thanks for the solution. Just wanted to know if we can assign some kind of a declare variable to the value after the 'limit' keyword. I am trying to find solutions in bigquery but haven't had much luck yet. – Ajay Kumar Jan 20 '20 at 09:52
  • In my case, i use ORDER BY NEWID() in sql server, but it caused to daedlock. I found out and identified the problem here: https://www.sqlservercentral.com/forums/topic/deadlock-due-to-intra-query-parallelism – Lê Duy Thứ Apr 27 '23 at 07:45
40
SELECT TOP 5 Id, Name FROM customerNames ORDER BY NEWID()
Cody Caughlan
  • 32,456
  • 5
  • 63
  • 68
16

In case someone wants a PostgreSQL solution:

select id, name
from customer
order by random()
limit 5;
Barry Brown
  • 20,233
  • 15
  • 69
  • 105
11

Maybe this site will be of assistance.

For those who don't want to click through:

SELECT TOP 1 column FROM table
ORDER BY NEWID()
10

I have found this to work best for big data.

SELECT TOP 1 Column_Name FROM dbo.Table TABLESAMPLE(1 PERCENT);

TABLESAMPLE(n ROWS) or TABLESAMPLE(n PERCENT) is random but need to add the TOP n to get the correct sample size.

Using NEWID() is very slow on large tables.

TylerH
  • 20,799
  • 66
  • 75
  • 101
Billy
  • 331
  • 3
  • 5
8

There is a nice Microsoft SQL Server 2005 specific solution here. Deals with the problem where you are working with a large result set (not the question I know).

Selecting Rows Randomly from a Large Table http://msdn.microsoft.com/en-us/library/cc441928.aspx

JohnC
  • 2,687
  • 1
  • 22
  • 30
7

This is an old question, but attempting to apply a new field (either NEWID() or ORDER BY rand()) to a table with a large number of rows would be prohibitively expensive. If you have incremental, unique IDs (and do not have any holes) it will be more efficient to calculate the X # of IDs to be selected instead of applying a GUID or similar to every single row and then taking the top X # of.

DECLARE @minValue int;
DECLARE @maxValue int;
SELECT @minValue = min(id), @maxValue = max(id) from [TABLE];

DECLARE @randomId1 int, @randomId2 int, @randomId3 int, @randomId4 int, @randomId5 int
SET @randomId1 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId2 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId3 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId4 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId5 = ((@maxValue + 1) - @minValue) * Rand() + @minValue

--select @maxValue as MaxValue, @minValue as MinValue
--  , @randomId1 as SelectedId1
--  , @randomId2 as SelectedId2
--  , @randomId3 as SelectedId3
--  , @randomId4 as SelectedId4
--  , @randomId5 as SelectedId5

select * from [TABLE] el
where el.id in (@randomId1, @randomId2, @randomId3, @randomId4, @randomId5)

If you wanted to select many more rows I would look into populating a #tempTable with an ID and a bunch of rand() values then using each rand() value to scale to the min-max values. That way you do not have to define all of the @randomId1...n parameters. I've included an example below using a CTE to populate the initial table.

DECLARE @NumItems int = 100;

DECLARE @minValue int;
DECLARE @maxValue int;
SELECT @minValue = min(id), @maxValue = max(id) from [TABLE];
DECLARE @range int = @maxValue+1 - @minValue;

with cte (n) as (
   select 1 union all
   select n+1 from cte
   where n < @NumItems
)
select cast( @range * rand(cast(newid() as varbinary(100))) + @minValue as int) tp
into #Nt
from cte;

select * from #Nt ntt
inner join [TABLE] i on i.id = ntt.tp;

drop table #Nt;
Protiguous
  • 89
  • 2
  • 9
RIanGillis
  • 619
  • 1
  • 6
  • 15
  • @Protiguous, the edit you proposed broke the random selection. Using min() and max() applied to the dbo.Tally64k table would not allow the user to select a row with a pk id > 65556. – RIanGillis Sep 23 '19 at 13:41
  • The table name change was simply an artifact from testing. The actual table name doesn't matter, as long as the correct table is used. min() and max() can both be queried in one query rather than two, which is what I was trying to show. – Protiguous Jan 14 '20 at 21:48
  • @Protiguous Ah, I see that now, I was confused because you used the 0-65k when doing the min-max but not later. After your most recent edit I actually wanted to ask you about the performance implications of the changes you made, as performance tuning is one of my interests and seemingly meaningless decisions like which side of the equals sign you place something can actually have a significant impact --- Would the same thing apply to the 5 SET @randomId## calls? Or is that different because it is not SELECTing FROM an actual table? – RIanGillis Jan 15 '20 at 04:49
  • I'm not sure I understand your question. Are you asking why there are 5 SET instead of just 1 SELECT @id1=rand(), @id2=rand().. ? It's because multiple calls to a rand() in 1 statement will produce the same result, hence the separated SET. (rand() on SQL Server is a deterministic function, I believe.) I would *guess* that 1 select vs 5 set is in the nanosecond range performance-wise. – Protiguous Mar 06 '20 at 23:43
6
SELECT * FROM TABLENAME ORDER BY random() LIMIT 5; 
Pang
  • 9,564
  • 146
  • 81
  • 122
Narendra
  • 967
  • 15
  • 29
  • SELECT * FROM ( SELECT * FROM table ORDER BY DBMS_RANDOM.VALUE) WHERE rownum < number; @Bear try this – Narendra Oct 22 '18 at 14:05
6

If you have a table with millions of rows and care about the performance, this could be a better answer:

SELECT * FROM Table1
WHERE (ABS(CAST(
  (BINARY_CHECKSUM
  (keycol1, NEWID())) as int))
  % 100) < 10

https://msdn.microsoft.com/en-us/library/cc441928.aspx

Tohid
  • 6,175
  • 7
  • 51
  • 80
  • Note that this will select approximately 10% of the rows in the table. If you need to select an exact number of rows, or at least N rows, this approach won't work. – LarsH Jun 03 '19 at 14:49
3

In order to shuffle the SQL result set, you need to use a database-specific function call.

Note that sorting a large result set using a RANDOM function might turn out to be very slow, so make sure you do that on small result sets.

If you have to shuffle a large result set and limit it afterward, then it's better to use something like the Oracle SAMPLE(N) or the TABLESAMPLE in SQL Server or PostgreSQL instead of a random function in the ORDER BY clause.

So, assuming we have the following database table:

Song database table

And the following rows in the song table:

| id | artist                          | title                              |
|----|---------------------------------|------------------------------------|
| 1  | Miyagi & Эндшпиль ft. Рем Дигга | I Got Love                         |
| 2  | HAIM                            | Don't Save Me (Cyril Hahn Remix)   |
| 3  | 2Pac ft. DMX                    | Rise Of A Champion (GalilHD Remix) |
| 4  | Ed Sheeran & Passenger          | No Diggity (Kygo Remix)            |
| 5  | JP Cooper ft. Mali-Koa          | All This Love                      |

Oracle

On Oracle, you need to use the DBMS_RANDOM.VALUE function, as illustrated by the following example:

SELECT
    artist||' - '||title AS song
FROM song
ORDER BY DBMS_RANDOM.VALUE

When running the aforementioned SQL query on Oracle, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| JP Cooper ft. Mali-Koa - All This Love            |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |

Notice that the songs are being listed in random order, thanks to the DBMS_RANDOM.VALUE function call used by the ORDER BY clause.

SQL Server

On SQL Server, you need to use the NEWID function, as illustrated by the following example:

SELECT
    CONCAT(CONCAT(artist, ' - '), title) AS song
FROM song
ORDER BY NEWID()

When running the aforementioned SQL query on SQL Server, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |
| JP Cooper ft. Mali-Koa - All This Love            |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |

Notice that the songs are being listed in random order, thanks to the NEWID function call used by the ORDER BY clause.

PostgreSQL

On PostgreSQL, you need to use the random function, as illustrated by the following example:

SELECT
    artist||' - '||title AS song
FROM song
ORDER BY random()

When running the aforementioned SQL query on PostgreSQL, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| JP Cooper ft. Mali-Koa - All This Love            |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |

Notice that the songs are being listed in random order, thanks to the random function call used by the ORDER BY clause.

MySQL

On MySQL, you need to use the RAND function, as illustrated by the following example:

SELECT
  CONCAT(CONCAT(artist, ' - '), title) AS song
FROM song
ORDER BY RAND()

When running the aforementioned SQL query on MySQL, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| JP Cooper ft. Mali-Koa - All This Love            |

Notice that the songs are being listed in random order, thanks to the RAND function call used by the ORDER BY clause.

Vlad Mihalcea
  • 142,745
  • 71
  • 566
  • 911
1

If you are using large table and want to access of 10 percent of the data then run this following command: SELECT TOP 10 PERCENT * FROM Table1 ORDER BY NEWID();

Palash Mondal
  • 468
  • 4
  • 10
0

If you use Yandex Database then you should use

select column from table order by random (TableRow()) limit 1;
John
  • 451
  • 6
  • 21
0

If you don't want to use NEWID() and the primary key column is int, then you can just select a random primary key like this:

with a as 
(
select count(id) as row_count
from mytable
)

select *
from mytable , a
where id = round(rand() * row_count, 0)
Muzib
  • 2,412
  • 3
  • 21
  • 32
0

If you need just to shuffle sequential values then you don't need always to use random (as it's non sql standard), you can try to use some tricks, like using reverse(PK)

SELECT PK FROM products ORDER BY REVERSE(concat('', PK))

So let's say we have a values: 123 124 125 223 224 225 323 324 325

then after reverse we will see them in the following order:

321 - 1
322 - 4
323 - 7
421 - 2
422 - 5
423 - 8
521 - 3
522 - 6
523 - 9
Yura
  • 1,733
  • 1
  • 20
  • 19