4

When I read this post: SQL: Group by minimum value in one field while selecting distinct rows

the following solution did not make sense to me and I wonder how it can possibly work:

SELECT id, min(record_date), other_cols 
FROM mytable
GROUP BY id

This does NOT work on my database, and it somehow seems illogical to me (although it does work in the SQL fiddle example). My DBMS complains that

Column must be group column: other_cols

I am using MaxDB - is this a DBMS specific issue?

Sandra Rossi
  • 11,934
  • 5
  • 22
  • 48
abulhol
  • 171
  • 1
  • 7

3 Answers3

3

It works in some RDBMSs like MySQL. MySQL docs says:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group.

But in case of SQL Server, you need to include all fields in the GROUP BY clause. Otherwise, it will report an error like:

Column 'ColName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

See MSDN.

Raging Bull
  • 18,593
  • 13
  • 50
  • 55
  • This is quite amazing and unexpected - thank you for the very helpful answers and explanations! – abulhol May 30 '14 at 09:24
  • @abulhol: Feel free to accept the answer if you are satisfied. So that it could help others in future. If you don't know how to accept an answer, please [**read this**](http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work). – Raging Bull May 30 '14 at 09:28
2

It depends on which RDBMS you are using, I know that SQL Server will not allow this because you need to perform a summary operation (MIN, MAX, AVERAGE etc) on each field that is not in the Group By clause or add the field to the Group By clause.

When you run the fiddle using MySql you can see from the example that it works and appears to be returning the first value of other_cols for each set.

From your question you are using an RDBMS that does not support this syntax (like SQL Server).

connectedsoftware
  • 6,987
  • 3
  • 28
  • 43
0

It's SQL standard. Instead you can use:

SELECT mytable.id, mytable_id_with_min_date.min_record_date, other_cols 
FROM mytable JOIN
  (
    SELECT id, min(record_date) min_record_date
    FROM mytable
    GROUP BY id
  ) mytable_id_with_min_date ON (mytable_id_with_min_date.id = mytable.id AND mytable.record_date = mytable_id_with_min_date.min_record_date)

Maybe your code looks logical but values of which row would you put into other_cols?

  • I just noticed that actually the two options do not give the same results! Only if the other_cols are identical for each group. If they are different, you must use the subselect! – abulhol Jun 06 '14 at 08:43