1

I have the following 2 tables

  1. academic_session(id, start_date, end_date)
  2. student (id, name ...)

The registration table contains information of student registration to a course for an academic session.

  1. registration (id, student, academic_session, course,....... )

If I execute this sql I get the following result. Which is correct.

select student_course_registration.student, student_course_registration.course, 
academic_session.id academic_session, academic_session.start_date, academic_session.end_date
from student_course_registration, academic_session  where student_course_registration.student=101366
and student_course_registration.academic_session=academic_session.id

enter image description here

Now I am trying to fetch the latest record of student's enrolment. I wrote the following SQL -

select 
student_course_registration.student,
student_course_registration.academic_session,
max(academic_session.start_date)
from student_course_registration, academic_session  where student_course_registration.student=101366
and student_course_registration.academic_session=academic_session.id group by student_course_registration.student

The output is -

enter image description here

In the output, max() function is fetching the max date from academic session table but the academic_session id is wrong. it should be 95. Could anyone please help me?

Many thanks.

Roy
  • 61
  • 1
  • 8

2 Answers2

2

You need a query that can extract details from data with the greatest value.

Start with a subquery to use MAX ... GROUP BY to find the latest session start date for each student. Something like this should do the trick. It gives a result set with the student and the latest session.

       SELECT student_course_registration.student,
              MAX(academic_session.start_date) start_date
         FROM student_course_registration
         JOIN academic_session
              ON student_course_registration.academic_session=academic_session.id 
        GROUP BY student_course_registration.student

Then, JOIN to that subquery to retrieve the details you want.

SELECT student_course_registration.student,
       academic_session.id academic_session,
       academic_session.start_date
  FROM student_course_registration
  JOIN academic_session
         ON student_course_registration.academic_session=academic_session.id
  JOIN ( /* here is the subquery */
       SELECT student_course_registration.student,
              MAX(academic_session.start_date) start_date
         FROM student_course_registration
         JOIN academic_session
              ON student_course_registration.academic_session=academic_session.id 
        GROUP BY student_course_registration.student
   ) maxdate ON academic_session.start_date = maxdate.start_date
            AND student_course_registration.student = maxdate.student
 WHERE student_course_registration.student=101366

Don't worry about performance: SQL is made for this kind of thing. If your query is slower than you want you can add indexes.

Pro tip Use FROM a JOIN b rather than FROM a, b. The former is obsolete syntax.

O. Jones
  • 103,626
  • 17
  • 118
  • 172
1

You are using the aggregation max() on date (while grouping by student)

The result for the column academic_session is therefore unpredictable IF THAT VALUE is not unique within the selected aggregation. (Run it a few hundred times, and the result may change)

MSSQL does not allow this (for reasons) - mysql allows it, and returns "something" - which might be wrong. Wrong in terms of: "That value is in no way related to your max(date) column, they only have the same student in common"

You either have to aggregate that column as well - or use a different query.

Your statement would be correct, if you consider this missing aggregation, and use group_concat to get the "possible options" for that column:

select 
student_course_registration.student,
group_concat(student_course_registration.academic_session),
max(academic_session.start_date)
from student_course_registration, academic_session  where student_course_registration.student=101366
and student_course_registration.academic_session=academic_session.id group by student_course_registration.student

but mind, that column would still be grouped by student, ignoring any relation to the start_date.

ps.: To fetch a result with the latest date: use ORDER BY date DESC LIMIT 0,1 - and you avoid all the troubles of grouping here:

select 
student_course_registration.student,
student_course_registration.academic_session,
academic_session.start_date
from student_course_registration, academic_session  where student_course_registration.student=101366
and student_course_registration.academic_session=academic_session.id
ORDER BY academic.session.start_date DESC
LIMIT 0,1

pps.: You need to use aggregations, if you would like to fetch that information for 40 students in one query. (grouping by student).

To fetch the latest information for a single student, using order and limit is sufficent (and way faster)

dognose
  • 20,360
  • 9
  • 61
  • 107