1

For example, I got a table:

Name  , Area,   Age
cat   ,   1 ,    10
dog   ,   2 ,     7
cat   ,   3 ,    11
horse ,   4 ,     4
cat   ,   5 ,    10
dog   ,   6 ,     9

When I group by 'Name', for each group I want to retain the 'Area' of the MAX 'Age' in that group. In this example, I wanna get:

Name , Area
cat  ,  3
dog  ,  6
horse,  4

How should I do this in one query? Thanks guys!

xiaolong
  • 3,396
  • 4
  • 31
  • 46

3 Answers3

2

Try something like this:

SELECT name, area FROM mytable
JOIN (
    SELECT name, MAX(age) as maxage FROM mytable
    GROUP BY name
) AS `max`
ON  mytable.name = max.name AND mytable.age = max.maxage

This first selects the name and MAX(age) in a subquery, and then joins them to the original table so that you can get the area associated with the MAX(age). By using join ie inner join, we insure that any results in the original table that had nothing to match do not show.

Notice that you can't do something like:

SELECT name, MAX(age), area FROM mytable

Because area would be randomly selected from all the area values in the group. It wouldn't know which area you want. You might think that it would get you the area in the same row as the MAX(age), but it will not. It doesn't actually know that is what you want.

Damien Black
  • 5,579
  • 18
  • 24
2

This would be more slightly more efficient than the subquery approach mentioned by @deroby and @DamienBlack:

SELECT t1.name, t1.area
FROM myTable t1
LEFT JOIN myTable t2 
  ON t1.name = t2.name AND t2.Age > t1.Age
WHERE t2.some_primary_key IS NULL
ORDER BY t1.name

Note that this requires some column known to contain a value that is not NULL (such as a primary key). You can substitute, t2.some_primary_key with any other non-null, indexed column as needed.

SQLFiddle based off @deroby's here.

PinnyM
  • 35,165
  • 3
  • 73
  • 81
  • Although the execution plan on MSSQL (++) seems to agree with you, the result of this query isn't quite correct (yet). [++: not a clue on how to compare this on mysql] – deroby Feb 10 '14 at 21:13
  • That is a clever solution as well. I wonder what the performance difference is. – Damien Black Feb 10 '14 at 21:17
  • Indeed, plugging `SELECT t1.Name, t1.Area FROM test t1 LEFT JOIN test t2 ON t1.Name = t2.Name AND t2.Age > t1.Age WHERE t2.Name IS NULL` into the sqlFiddle returns the same information now. +1 =) – deroby Feb 10 '14 at 21:17
  • 1
    @DamienBlack - see [this answer](http://stackoverflow.com/a/8749095/1141839) from Bill Karwin on this point. – PinnyM Feb 10 '14 at 21:20
  • On MSSQL the query plans differ quite a bit. The question off course is how (if?) indexing and volumes might affect this. [QueryPlan](https://dl.dropboxusercontent.com/u/747955/Stuff%20shared%20in%20forums%20%28do%20NOT%20delete%2C%20keep%20small%21%29/SO-QueryPlan.png) – deroby Feb 10 '14 at 21:23
  • @deroby - as this question applies to MySQL, you can refer to the link I referenced above. To be sure, there should be a compound index on name and age for this to work efficiently... – PinnyM Feb 10 '14 at 21:41
  • I know, but I simply don't have mysql here, nor know how to compare the performance of queries. Somehow I would assume non-exotic queries perform (relatively) alike in the different products. FYI: After indexing the difference between both queries is rather similar ... on MSSQL =) [QueryPlan w Indexes](https://dl.dropboxusercontent.com/u/747955/Stuff%20shared%20in%20forums%20%28do%20NOT%20delete%2C%20keep%20small%21%29/SO-QueryPlanIndexed.png) – deroby Feb 10 '14 at 21:42
  • You would need a larger set of data before you saw a material difference. But you should know that every DBMS query planner is _not_ alike even for commonly used strategies - [here is one interesting example](http://stackoverflow.com/questions/2246772/whats-the-difference-between-not-exists-vs-not-in-vs-left-join-where-is-null/2246793#2246793). – PinnyM Feb 10 '14 at 21:51
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/47193/discussion-between-deroby-and-pinnym) – deroby Feb 10 '14 at 21:54
0

Seems I need to learn to type faster =)

Anyway, I came up with this; you can test it in this sqlFiddle

 SELECT t.Name, t.Area
  FROM (SELECT Name, Max(Age) as Max_Age
          FROM test
         GROUP BY Name) mx
  JOIN test t
    ON t.Name = mx.Name
   AND t.Age  = mx.Max_Age
 ORDER BY t.Name
deroby
  • 5,902
  • 2
  • 19
  • 33
  • thanks, this is almost the same query with @DamienBlack's. Worked well. – xiaolong Feb 10 '14 at 22:23
  • Yeah, it pretty much IS the same, but at least he explained what it does, and typed faster =). Then again, I have a sqlFiddle =P. Anyway, please do take PinnyM's comments into account about performance. It may work great now on 100 records but cause headaches once you're a couple of months into production with bunch-loads of records. – deroby Feb 10 '14 at 22:47