1

I have a relation that has airline routes and the airports that these flights go over on such routes. I'm trying to identify what routes skip over the same airports.

I have whittled down the relation into a table that should be possible to manipulate to create my desired result:

SELECT route_id, airport_id
FROM routes_airports
WHERE stops = false
ORDER BY route_id, airport_id;

I would like to match routes to one another when they have the same values for airport_id across their entries in the table while including the routes that exhibit this property.

So, for example, routes 5 and 7 both skip over airports 10,15,20, so they should be matched together, but not with say, route 10 which only skips over airports 10 and 20.

 route_id | skipped_airport_id
----------+------------
        1 |         76
        2 |         21*
        2 |         22*
        4 |         42
        5 |         21*
        5 |         22*
        7 |         15
        7 |         16
        7 |         17
        7 |         18
        7 |         46
        9 |         26
       11 |         19
       14 |         45*
       14 |         46*
       14 |         47*
       15 |         45*
       15 |         46*
       15 |         47*
       17 |         78
       20 |         20

I would like the above example data to result in a table with just the routes that have a match such as below.

 route_id 
----------
        2 
        5
       14
       15
almater
  • 27
  • 4
  • No need for `GROUP BY` when no aggregate functions are involved. If you want do eliminate duplicate rows, simply do `SELECT DISTINCT` – jarlh Apr 04 '19 at 07:30

3 Answers3

0

Try something like this:

SELECT route_id, STRING_AGG(airport_id, ',') AS airports
FROM routes_airports
WHERE stops = FALSE
GROUP BY route_id
ORDER BY 2

This will collect the airport_ids into a single column, and ORDER BY that column.

Usagi Miyamoto
  • 6,196
  • 1
  • 19
  • 33
0
WITH
  skips AS
(
  SELECT route_id, STRING_AGG(airport_id, ',' ORDER BY airport_id) AS airport_ids
  FROM routes_airports
  WHERE stops = false
  GROUP BY route_id
)
SELECT airport_ids, STRING_AGG(route_id, ',' ORDER BY route_id) AS route_ids
FROM skips
GROUP BY airport_ids
HAVING COUNT(*) > 1
MatBailie
  • 83,401
  • 18
  • 103
  • 137
0

You can do this by aggregating all skipped airports into an array and then find out those routes where those arrays are the same:

with skipped as (
  select route_id, array_agg(skipped_airport_id order by skipped_airport_id) skipped_airports
  from routes_airports
  where stops = false
  group by route_id
) 
select s1.*
from skipped s1
where exists (select * 
              from skipped s2
              where s1.route_id <> s2.route_id
                and s1.skipped_airports = s2.skipped_airports);

This returns:

route_id | skipped_airports
---------+-----------------
       2 | {21,22}         
       5 | {21,22}         
      14 | {45,46,47}      
      15 | {45,46,47}      

Online example: https://rextester.com/MJPJ90714