2

I have found a similar question: What is the order of execution for this SQL statement. The accepted answer specifies that FROM is executed prior to any other clauses.

SELECT StudentName, Students.StudentID, Cnt 
FROM (
    SELECT Students.StudentID, count(StudentCourses.CourselD) as [Cnt]
    FROM Students LEFT JOIN StudentCourses
    ON Students.StudentID = StudentCourses.StudentID GROUP BY Students.StudentID
) T INNER JOIN Students on T.StudentID = Students.StudentID

How does count() work in this case?

I am confused because it does not seem to follow my logic. Assuming that LEFT JOIN in nested FROM is executed first (please let me know if this is wrong), We will have at least one row for each student as an intermediate output, possibly with a value of null if a student is not taking any courses. If count() is applied to that output, such students will have 1 counted due to null valued row, which does not happen in the actual result. Could anyone explain how this query gets executed under the hood?

Thanks in advance

Community
  • 1
  • 1
Jas
  • 77
  • 1
  • 7
  • what's your question exactly? that query works fine because is valid. besides that you are making a wrong assumption: you do not know the order of execution of the statements so you cannot say for sure that `FROM` statements are always executed first. **logical order != execution order** – Paolo Sep 25 '14 at 13:33
  • I see that the query is valid, but I am not sure how & when count() is applied. I've made some clarifications in the original question, please let me know if you find my wording confusing. – Jas Sep 25 '14 at 13:56

2 Answers2

0

Your problem is not down to order of execution. I think it is more due to the fact that

COUNT(StudentCourses.CourselD)

will only count non-null results.

Having this as:

SELECT Students.StudentID, count(*) as [Cnt]
FROM Students LEFT JOIN StudentCourses
ON Students.StudentID = StudentCourses.StudentID GROUP BY Students.StudentID 

Would return a count of all the rows, not just those where course ID was not null.

Paddy
  • 33,309
  • 15
  • 79
  • 114
  • So the rule of thumb is, use count(*) to count all matching rows, and count(column) to count those with that column != null? – Jas Sep 25 '14 at 14:12
0

COUNT() is in the subquery, not in the outer query. You only select the value returned by count() in the outer query.

The value of Cnt column in the subquery will be an integer greater or equal to 0 (since COUNT(NULL) returns 0, and not all students have rows in StudentCourses table, I believe you'll have zeros in that column). The subquery returns one row for each student, with COUNT() value of 0 or more.

Besides, as Paolo mentioned in the comments - logical order != execution order, so do not assume what is executed before what. Database engines have multiple mechanisms to optimise query execution, so you should prefer reading the docs to assuming how something works.

Kleskowy
  • 2,648
  • 1
  • 16
  • 19