0

I have a table with information about flights between cities that looks like this:

origin_city dest_city   time    
Dothan AL   Atlanta GA    171       
Dothan AL   Elsewhere AL    2      
Boston MA   New York NY     5    
Boston MA   City MA         5    
New York NY Boston MA       5        
New York NY Poughkipsie NY  2

I want to return, for each origin city, the maximum flight time and the destination city or cities that go with it. So the results would look like this:

Dothan AL    Atlanta GA    171
Boston MA    New York NY   5
Boston MA    City MA       5
New York NY  Boston MA     5

Based on other threads, like Reason for Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause, I wrote a query that returns the max flight time from each origin city:

SELECT DISTINCT   F.origin_city AS origin_city, Max(F.actual_time) AS actual_time
FROM     Flights AS F
GROUP BY F.origin_city
ORDER BY F.origin_city

But I get various errors when I try to include the destination(s) that match that time. For example:

SELECT DISTINCT   F.origin_city AS origin_city, Max(F.actual_time) AS actual_time, F.dest_city AS dest_city
FROM     Flights AS F
GROUP BY F.origin_city
HAVING
    (MAX(F.actual_time) = F.actual_time)
ORDER BY F.origin_city
GO

... gives me the error "Column 'Flights.actual_time' is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause." I tried adding F.actual_time to the group by, but then I get tons of results because I get the max flight time for each origin/destination pair (I think). Also, the actual_time field should be contained in an aggregate function: Max.

This code:

SELECT F.origin_city AS origin_city, Max(F.actual_time) AS actual_time, F.dest_city AS dest_city
FROM Flights as F
WHERE F.actual_time IN (SELECT MAX(actual_time) FROM Flights AS F2 GROUP BY F2.origin_city) 
GROUP BY F.origin_city

gives a similar error: Column 'Flights.dest_city' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

How can I return all 3 columns correctly?

Tab Alleman
  • 31,483
  • 7
  • 36
  • 52
hemera
  • 23
  • 7
  • If you just googled your title with 'site:stackoverflow.com sql' then you would find that this is a faq. [ask] PS Learn what the sequence of SQL clauses does in the description of what result it describes. Group by partitions into groups, then MAX of a non-grouping column in HAVING or SELECT is the maximum in the group & a grouping column has the same value in every row of a group. Thus what you wrote doesn't make sense. (If you wanted an original column's value available as both grouped & not, you could select it under 2 names before grouping, which is yet another answer way to your post.) – philipxy Feb 21 '19 at 21:22

2 Answers2

0

You want RANK() :

SELECT F.*
FROM (SELECT F.*,
             RANK() OVER (PARTITION BY origin_city ORDER BY time DESC) AS SEQ
      FROM Flights AS F
     ) F
WHERE SEQ = 1;

If i go with your version then, i would do :

SELECT F.*
FROM Flights as F
WHERE F.actual_time = (SELECT MAX(F2.actual_time) 
                       FROM Flights AS F2 
                       WHERE F2.origin_city = F.origin_city
                      ); 

Your GROUP BY is really unnecessary for both outer & subquery, just pass outer query reference (F.origin_city) in subquery (F2.origin_city) to make it correlated subquery.

Yogesh Sharma
  • 49,870
  • 5
  • 26
  • 52
0

You can join the query you already have to the table Flights matching the origin_city column and actual_time from Flights to max_actual_time from the query:

SELECT F.*   
FROM Flights AS F INNER JOIN (
SELECT 
  origin_city, 
  Max(actual_time) AS max_actual_time
FROM Flights
GROUP BY origin_city) AS T
ON T.origin_city = F.origin_city AND T.max_actual_time = F.actual_time
ORDER BY F.origin_city
forpas
  • 160,666
  • 10
  • 38
  • 76