Because the book is wrong.
The columns in the group by
have only one relationship to the columns in the select
according to the ANSI standard. If a column is in the select
, with no aggregation function, then it (or the expression it is in) needs to be in the group by
statement. MySQL actually relaxes this condition.
This is even useful. For instance, if you want to select rows with the highest id for each group from a table, one way to write the query is:
select t.*
from table t
where t.id in (select max(id)
from table t
group by thegroup
);
(Note: There are other ways to write such a query, this is just an example.)
EDIT:
The query that you are suggesting:
select EMP_ID, SALARY
from EMPLOYEE_PAY_TBL
group by BONUS;
would work in MySQL but probably not in any other database (unless BONUS
happens to be a poorly named primary key on the table, but that is another matter). It will produce one row for each value of BONUS
. For each row, it will get an arbitrary EMP_ID
and SALARY
from rows in that group. The documentation actually says "indeterminate", but I think arbitrary is easier to understand.
What you should really know about this type of query is simply not to use it. All the "bare" columns in the SELECT
(that is, with no aggregation functions) should be in the GROUP BY
. This is required in most databases. Note that this is the inverse of what the book says. There is no problem doing:
select EMP_ID
from EMPLOYEE_PAY_TBL
group by EMP_ID, BONUS;
Except that you might get multiple rows back for the same EMP_ID
with no way to distinguish among them.