1

I have a database with the following tables:

student(sid, name,address)

course(cid,type,department)

takes(sid,cid,score)

The query which i need to solve is Find the students such that, for each course they take, their score on the course is higher than any other student who has taken that same course.

I know I have to take a self join in such cases but still confused how to solve such queries !

Prahalad Gaggar
  • 11,389
  • 16
  • 53
  • 71
Crusaderpyro
  • 2,163
  • 5
  • 29
  • 53

4 Answers4

1
SELECT
    s.sid,
    s.name
FROM student s
WHERE NOT EXISTS (
    SELECT *
    FROM takes t1
    JOIN takes t2 ON (t1.cid = t2.cid AND t1.sid <> t2.sid AND t2.score > t1.score)
    WHERE t1.sid = s.sid
);

We're selecting all students, for whom we can't find any course they took where someone else had a higher score.

Sqlfiddle

SQB
  • 3,926
  • 2
  • 28
  • 49
  • Wouldn't it be more simple to find the max score in `takes`, grouped by `cid` and then join this with the information of `student` to get the name? – Aaron Digulla Nov 15 '13 at 09:26
  • 1
    @AaronDigulla No, because that would just get the highest scoring student per course. If I understand OP correctly, OP wants to find all students (if any) that are the best in _every_ course they take. – SQB Nov 15 '13 at 09:30
0

Try this:

SELECT s.* 
FROM student s
INNER JOIN (
  SELECT t1.*
  FROM takes t1
    LEFT JOIN takes t2
      ON (t1.cid = t2.cid AND t1.score < t2.score)
  WHERE t2.cid IS NULL
  ) b ON b.sid = s.sid

sqlfiddle demo

You can see in the fiddle that if two users have the same score that is the best, both will be reurned.

You find the row with the best score in table takes, and you JOIN Student with it

(I started using this way to get the max value from a table based on this answer from Bill karwin. You can check his detailed explanation on that answer on how it works).

Community
  • 1
  • 1
Filipe Silva
  • 21,189
  • 5
  • 53
  • 68
  • Your example shouldn't return any rows, since there is no student that scored the highest in all courses they took. – SQB Nov 15 '13 at 09:53
  • @SQB. The requirement is "their score on the course is higher than any other student who has taken that same course." – Filipe Silva Nov 15 '13 at 09:54
  • The _full_ requirement is "find the students such that, **for each course they take**, their score on the course is higher than any other student who has taken that same course" (emphasis mine). Otherwise you could just select the highest scoring students, grouping by course. – SQB Nov 15 '13 at 09:56
  • What is the part in "for each course they take" that makes you assume that he has to be the best in all courses? – Filipe Silva Nov 15 '13 at 10:02
  • Well, exactly that. In each and every course they take, their score needs to be (one of) the highest. So if we have a student that took courses A, B, and C and scored highest in A and B, but not in C, we don't want that student. – SQB Nov 15 '13 at 10:08
  • Well. I don't think that's what he means, but at this point we'll have to agree to disagree and see what OP really wants when he comes back. :p – Filipe Silva Nov 15 '13 at 10:10
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/41295/discussion-between-sqb-and-filipe-silva) – SQB Nov 15 '13 at 22:28
0

SQB's answer solves your problem perfectly. I just give you another way of doing it. It can make a big difference in performance depending on how big your tables are and what indexes are available.

I select the maximum scores per course and join them with all takes. Then I compare the number of all courses taken by the student with the number of courses he/she has achieved highest score in. Thus the table is being read completely two times and the results are somehow glued together and then filtered. This can (but doesn't have to) be faster than having to search higher grading students for every record found in takes.

select *
from students
where sid in
(
  select takes.sid
  from takes
  left outer join (select cid, max(score) as score from takes group by cid) max_scores 
  on (max_scores.cid=takes.cid and max_scores.score=takes.score)
  group by takes.sid
  having count(takes.cid) = count(max_scores.cid)
);
Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
0

If the requirement is "for each course, find the student(s) who got the highest mark" then it can be done with this query:

WITH ranked_scores AS (
  SELECT sid,
         cid,
         score,
         DENSE_RANK() OVER ( PARTITION BY cid ORDER BY score DESC ) AS "rank"
  FROM   takes
)
SELECT s.sid, s.name, r.cid, r.score
FROM   student s
       INNER JOIN
       ranked_scores r
       ON ( r.sid = s.sid )
WHERE  r."rank" = 1;

SQLFIDDLE

If the requirement is to "find the student(s) who got the highest mark of the course in all courses they took" then you can use this:

WITH ranked_scores AS (
  SELECT sid,
         cid,
         score,
         DENSE_RANK() OVER ( PARTITION BY cid ORDER BY score DESC ) AS "rank"
  FROM   takes
)
SELECT s.sid,
       s.name
FROM   student s
WHERE  EXISTS ( SELECT 1
                FROM   ranked_scores r
                WHERE  r.sid = s.sid
                GROUP BY r.sid
                HAVING MAX( r."rank" ) = 1 );

SQLFIDDLE

MT0
  • 143,790
  • 11
  • 59
  • 117