0

I am a beginner in SQL. I have the following 'animal' table:

RACE  NAME   AGE
dog   Medor  10
dog   Arthur 17
dog   Pitou  9
cat   Kitty  11
cat   Milo   5
cat   Loki   7
dog   Jack   14

I want to write a query to get the maximum age of each race, as well as the name of the associated animal. I tried:

SELECT race, name, MAX(age) FROM animal GROUP BY race

It does perform the per-race grouping, and does return the maximum age of each race, however, the name of the animal in each race seems random. Basically, it returns

RACE  NAME   MAX(age)
dog   Medor  17
cat   Kitty  11

whereas I would like it to return

RACE  NAME   MAX(age)
dog   Arthur 17
cat   Kitty  11

I read about ONLY_FULL_GROUP_BY and understood SQL will select any value for the NAME column, whereas I expected it to select the value associated with the maximum age.

The only other way I can imagine would be to do a sub-query, but this seems very complicated.

SELECT race, name, age FROM animal B WHERE age = (SELECT MAX(age) FROM animal A WHERE A.race = B.race)

I am surprised there is not a simpler solution. I have two questions:

  • is it correct my original query should not work (ie. produce the expected result) in the first place?
  • is there a better way than doing a subquery?

I'm using MySQL 8.x.

andynewman
  • 51
  • 5
  • 1
    "...however, the name of the animal in each race seems random..." -- Yes, that's true, because the query is malformed, MySQL does not complain and runs it. – The Impaler Jun 24 '22 at 18:17
  • If you were to use an aggregate like `AVG()` instead of `MAX()` (assuming none of the rows contains exactly the average value), or if you were to use both `MAX()` and `MIN()` in the same select-list, which row should it intuitively return? – Bill Karwin Jun 24 '22 at 18:59

1 Answers1

1

The typical solution is to use ROW_NUMBER() to identify the rows you want.

For example:

select *
from (
  select a.*,
    row_number() over(partition by race order by age desc) as rn
  from animal a
) x
where rn = 1
The Impaler
  • 45,731
  • 9
  • 39
  • 76
  • This is a fast reply! Thank you. Still surprised a more simple solution does not exist to select row-wide data when a single aggregate is provided. I think there is room for improvement in SQL! – andynewman Jun 24 '22 at 18:23
  • avoid using select *. here in particular, it will include rn which doesn't seem useful – ysth Jun 24 '22 at 18:40
  • @andynewman There's a `DISTINCT ON` clause in PostgreSQL that servers this purpose in a rather simpler way. However, that's not part of the SQL Standard and is not implemented in MySQL. – The Impaler Jun 24 '22 at 19:30