I was taught and heard that in sql/mysql, items in select
clause must appear in the GROUP BY clause or be used in an aggregate function as in here
However, the example below may have changed my mind.
Two tables: Student (sid is the key)
sid | name | email
========================
99901| mike | mike@a.edu
99902| jane | jane@b.edu
99903| peter| pete@b.edu
Took (sid+oid together is the key, oid stands for offering id)
sid | oid| grade
==================
99901| 1 | 100
99901| 2 | 30
99901| 3 | 40
99902| 4 | 100
99902| 5 | 100
99902| 6 | 40
99903| 6 | 95
Question: I want to find the sid, name and average grade of each student who has taken at least 2 courses.
Answer:
select s.sid, name, avg(grade) as average
from Student as s, Took as t
where s.sid = t.sid
group by s.sid
having count(*) >= 2;
Result:
sid | name | avgerage
=======================
99901| mike | 56.6667
99902| jane | 80.0000
Based on must appear in the GROUP BY clause or be used in an aggregate function, the query should have been incorrect because name
is neither in group clause nor an aggregate function.
I looked some posts and this, my understanding is that although name
is neither in group clause nor an aggregate function, we group by sid
which is the key and each sid
only correspond to one name
, so it won't return multiple options from which sql doesn't know which one to return. To confirm my understanding, if I select one more column email
, it's still ok; but if I select oid
, it gives error because each sid
corresponds to more than one oid
.
Could someone correct my understanding if it is wrong or elaborate more on this statement: must appear in the GROUP BY clause or be used in an aggregate function
Thanks.
First Edit:
Btw, I tested in MySQL 8.0.17
Second Edit:
Just a summary of useful links when you read the answers/comments below.