I have table name m_option
:
m_option_id m_student_id value 1 1 5 2 1 5 3 1 6 4 1 7 5 2 1 6 2 2 7 2 3 8 2 3 9 2 4
I want to get the 2 rows with min value
for each m_student_id
:
m_option_id m_student_id value 1 1 5 2 1 5 5 2 1 6 2 2
I have table name m_option
:
m_option_id m_student_id value 1 1 5 2 1 5 3 1 6 4 1 7 5 2 1 6 2 2 7 2 3 8 2 3 9 2 4
I want to get the 2 rows with min value
for each m_student_id
:
m_option_id m_student_id value 1 1 5 2 1 5 5 2 1 6 2 2
You can use the row_number
window function for that:
SELECT m_option_id, m_student_id, value
FROM (
SELECT
m_option_id, m_student_id, value,
row_number() OVER (PARTITION BY m_student_id ORDER BY value)
FROM m_option
) t
WHERE
row_number <= 2;
row_number
will calculate the number of each row within its group. We then use that number to filter the the top 2 rows (i.e. lowest value
) from each group.
Alternatively, you could use a LATERAL
subquery:
SELECT m_option_id, m_student_id, value
FROM (SELECT DISTINCT m_student_id FROM m_option) s,
LATERAL (
SELECT m_option_id, value
FROM m_option
WHERE s.m_student_id=m_student_id
ORDER BY value
LIMIT 2
) t;
This will go through all distinct values of m_student_id
and for each one of them will find the top 2 rows using a LATERAL
subquery.
Assuming there can be many rows per student in table m_option
, the key to performance is index usage. And that's most efficient if you have a separate student
table listing all students uniquely (which you would typically have). Then:
SELECT m.m_option_id, s.student_id AS m_student_id, m.value
FROM student s
, LATERAL (
SELECT m_option_id, value
FROM m_option
WHERE m_student_id = s.student_id -- PK of table student
ORDER BY value
LIMIT 2
) m;
A multicolumn index on m_option
makes this fast:
CREATE INDEX m_option_combo_idx ON m_option (m_student_id, value);
If you can get index-only scans out of it, append the column m_option_id
as last index item:
CREATE INDEX m_option_combo_idx ON m_option (m_student_id, value, m_option_id)
Index columns in this order.
Distilling a unique list of student_id
from m_option
would incur an expensive sequential scan over m_option
and void any performance benefit.
This excludes students without any related rows in m_option
. Use LEFT JOIN LATERAL () ON true
to include such students in the result (extended with NULL
values for the missing option):
If you do not have a student
table, the other fast option is a recursive CTE.
Detailed explanation for either variant: