4

I have a table of questions set out like so..

id | question | answer | syllabus | difficulty

I want to create an SQL statement that selects 5 questions at random for each of the distinct syllabuses when the difficulty is easy.

So if there are 4 syllabuses I would have 20 questions.

I was thinking something like this...

   SELECT 
    * 
FROM 
    questions 
WHERE 
    difficulty='easy' 
AND 
    syllabus 
IN 
(
    SELECT DISTINCT 
        syllabus 
    FROM 
        questions 
    WHERE 
        difficulty='easy'
) 
LIMIT 
(5*
    (
    SELECT 
        COUNT(DISTINCT syllabus) 
    FROM 
        questions 
    WHERE 
        difficulty='easy'
    )

But this doesn't return 5 from each of the distinct syllabuses only the correct number of questions from any syllabus.

Purplegoldfish
  • 5,268
  • 9
  • 39
  • 59
TrueWheel
  • 997
  • 2
  • 20
  • 35

1 Answers1

1

This would work, but would be really slow:

SELECT * FROM questions WHERE difficulty='easy' ORDER BY RAND() LIMIT 5;

A better way is to select 5 ID's first, and then retrieve the rows corresponding to those ID's. Selecting the ID's themselves can be done with a WHERE ID > RAND(0,MAX(ID)), but if there are gaps, then your data will be skewed.

A better alternative is discussed here, but requires more effort: Simple Random Samples from a Sql database

Community
  • 1
  • 1
Konerak
  • 39,272
  • 12
  • 98
  • 118