0

Hi I wondered how to select a random set of 2% of all rows. I know that we can do it in such a way: 'LIMIT 100'. But the challenge here is that the table size changes, but I always want 2% of total rows by random, either rounding up or down is fine. The table has unique id for each row. Does anyone know how to do it?

SET @size := CEILING(0.02*(SELECT COUNT(*) FROM orders));

SELECT 
    *
FROM 
    `orders`
ORDER BY RAND()
LIMIT (SELECT @size)

I have tried using session variable, but the last line doesn't seem to work (LIMIT (SELECT @size))

JessZhu
  • 15
  • 1
  • 5
  • 1
    What version of MySQL are you using? – Nick Aug 07 '19 at 07:11
  • Can you use a prepared statement instead? `select * from orders order by rand() limit ?` – Schwern Aug 07 '19 at 07:34
  • Possible duplicate of [Using variable in a LIMIT clause in MySQL](https://stackoverflow.com/questions/245180/using-variable-in-a-limit-clause-in-mysql) – Schwern Aug 07 '19 at 07:35

1 Answers1

0

In MYSQL 8+, you can use the ROW_NUMBER() function to give a row number to each row, then select only rows with row numbers less than the maximum row number * 0.02 (i.e. 2% of rows). For example:

WITH cte AS 
(SELECT *, ROW_NUMBER() OVER (ORDER BY RAND()) AS rn
 FROM orders),
cte2 AS
(SELECT MAX(rn) AS numrows FROM cte)
SELECT *
FROM cte
JOIN cte2
WHERE rn < numrows * 0.02

Here's a demo on dbfiddle using a numbers table.

In MySQL 5.x you can emulate the ROW_NUMBER() function using variables or a JOIN. The answers to this question gives good examples of how to do this.

Nick
  • 138,499
  • 22
  • 57
  • 95
  • Got it! Thank you. I am using MySQL 5.7 to accommodate Sequel Pro. A simulated ROW_NUMBER() seems to slove my questions. – JessZhu Aug 07 '19 at 08:32