3

I have a SQL Server database with three tables: Trips, Slices, and Legs.

Each Trip has a one to many relationship with Slices and Slices has a one to many relationship with Legs.

Trips represents a full trip, a slice represents only the outbound or return portions of a trip, and the legs represent all the stops in either outbound or return slices.

I want to be able to find all the trips with matching legs.

Here's look at the tables:

Trips:

tripId  saleTotal   queryDate
1       $200            6/10/2015
2       $198            6/11/2015

Slices:

sliceId     connections duration    tripIdFK
1           1           50          1 
2           1           45          1 
3           0           60          2 
4           1           85          2 

Legs:

legId   carrier flightNumber    departureAirport    departureDate   ArrivalAirport  ArrivalDate sliceIDFK
1       AA      1               JFK                 7/1/2015        LON             7/2/2015    1 
2       AA      2               LON                 7/2/2015        FRA             7/2/2015    1 
3       AA      11              FRA                 7/10/2015       LON             7/10/2015   2
4       AA      12              LON                 7/10/2015       JFK             7/10/2015   2 

5       UA      5               EWR                 8/1/2015        LAX             8/1/2015    3
6       UA      6               LAX                 8/5/2015        ORD             8/5/2015    4 
7       UA      7               ORD                 8/5/2015        EWR             8/5/2015    4

How would I be able to find all the trips where the all the carrier and flight numbers match such as in legId 1-4 by searching departureAirport/arrivalAirport (JFK/FRA)?

In other words, legId 1-4 is one unit with the details for Trip 1 and legId 5-7 is another unit with the details for Trip 2. I need to find which other trips match exactly legId 1-4 details (except for PK and FK), etc. Any help would be greatly appreciated!!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
elesk01s
  • 131
  • 6
  • Which RDBMS are you using? What have you tried? – Politank-Z Jun 16 '15 at 20:10
  • MSSQL 2012. I have tried a few queries but haven't got close to solving it that's why I didn't bother posting them. – elesk01s Jun 16 '15 at 20:25
  • Showing what you've tried, and perhaps how you thought it would help and how it didn't help might guide anyone trying to answer you. – Politank-Z Jun 16 '15 at 20:30
  • The best thing you can do to get some help is to setup a test environment so others can work on the code for a solution. The best place to do that is at sqlfiddle.com – Sean Lange Jun 16 '15 at 20:42
  • I'll be my normal pedant self and point out that your database is not normalized, since `Slices.connections` can be derived from `Legs`. – Alvin Thompson Jun 16 '15 at 20:53
  • That's correct Alvin. I put it there only so it can help me understand the data a little faster. – elesk01s Jun 16 '15 at 20:56

3 Answers3

2

Hope this helps.

Just pass the base TripId (in @BaseTripID) with which you want to compare other records. I assume that you are concerned only about carrier,flightNumber,departureAirport,ArrivalAirport to match exactly with any other trip regardless of date fields.

create table Trips(tripId int,saleTotal int,queryDate date)
create table Slices(sliceId int ,connections int,duration int ,tripIdFK int)
create table Legs(legId int, carrier char(2), flightNumber int, departureAirport char(3), departureDate date, ArrivalAirport char(3), ArrivalDate date, sliceIDFK int)


insert into Trips values(1,200,'6/10/2015'),(2,198,'6/11/2015'),(3,300,'6/15/2015'),(4,200,'6/21/2015')
insert into Slices values(1,1,50,1),(2,1,45,1),(3,0,60,2),(4,1,85,2),(5,1,50,3),(6,1,45,3),(7,1,45,4),(8,1,45,4)
insert into Legs values(1,'AA',1,'JFK','7/1/2015','LON','7/2/2015',1) ,
(2,'AA',2,'LON','7/2/2015','FRA','7/2/2015',1),
(3,'AA',11,'FRA','7/10/2015','LON','7/10/2015',2),
(4,'AA',12,'LON','7/10/2015','JFK','7/10/2015',2),
(5,'UA',5,'EWR','8/1/2015','LAX','8/1/2015',3),
(6,'UA',6,'LAX','8/5/2015','ORD','8/5/2015',4),
(7,'UA',7,'ORD','8/5/2015','EWR','8/5/2015',4),
(8,'AA',1,'JFK','7/11/2015','LON','7/12/2015',5),
(9,'AA',2,'LON','7/12/2015','FRA','7/12/2015',5),
(10,'AA',11,'FRA','7/20/2015','LON','7/20/2015',6),
(11,'AA',12,'LON','7/20/2015','JFK','7/20/2015',6),
(12,'AA',1,'JFK','7/1/2015','LON','7/2/2015',7) ,
(13,'AA',2,'LON','7/2/2015','FRA','7/2/2015',7),
(14,'AA',11,'FRA','7/10/2015','BEL','7/10/2015',8),
(15,'AA',12,'BEL','7/10/2015','JFK','7/10/2015',8)

--select * from  Trips
--select * from  Slices
--select * from  Legs

-------------------------------------------------------------------

Declare @BaseTripID int = 1, @Legs int ,@MatchingTripID int

declare @BaseTrip table(carrier char(2), flightNumber int, departureAirport char(3), ArrivalAirport char(3),row_no int)
declare @MatchingTrip table(carrier char(2), flightNumber int, departureAirport char(3), ArrivalAirport char(3),row_no int,legid int,tripid int)


insert into @BaseTrip
select carrier, flightNumber, departureAirport, ArrivalAirport,ROW_NUMBER() over(order by l.legId) 
from Legs l join slices s on s.sliceId = l.sliceIDFK
where s.tripIdFK = @BaseTripID


select @Legs=count(*) from @BaseTrip


Insert into @MatchingTrip
select carrier, flightNumber, departureAirport, ArrivalAirport,ROW_NUMBER() over(partition by s.tripIdFK order by l.legId) as row_no,l.legId,s.tripIdFK
from Legs l join slices s on s.sliceId = l.sliceIDFK
and s.tripIdFK in 
(select s.tripIdFK
from Legs l join slices s on s.sliceId = l.sliceIDFK
and s.tripIdFK <> @BaseTripID
Group by s.tripIdFK having count(l.legId)=@Legs)


select @MatchingTripID = m.tripid
from @MatchingTrip m join @BaseTrip b 
on m.carrier = b.carrier
and m.flightNumber = b.flightNumber
and m.departureAirport = b.departureAirport
and m.ArrivalAirport = b.ArrivalAirport
and m.row_no = b.row_no
GROUP BY m.tripid HAVING COUNT(*) = @Legs


select s.tripIdFK as matchingTripID,l.legid,l.carrier,l.flightNumber,l.departureAirport,l.ArrivalAirport  
from Legs l 
join Slices s on s.sliceId = l.sliceIDFK
where s.tripIdFK = @MatchingTripID


---------------------
drop table Trips
drop table Slices
drop table Legs

making use of leg count is the key.Thus we are eliminating any matches other than completely identical legs(trip 4 with just two matching legs). So Now we get only Trip 3 as matching records.

please note that we are also excluding trips, that has any additional legs besides the matching ones. I hope this is what you expect, pair of Perfectly identical trips.

Jasqlg
  • 183
  • 1
  • 9
  • Interesting solution! One note however: it would be more efficient if instead of having to eliminate the base trip from `@MatchingTrip` on the join with `and m.tripid <> @BaseTripID`, the base trip wasn't in `@MatchingTrip` in the first place. How about instead adding `where t.tripId <> @BaseTripId` when inserting into `@MatchingTrip`? – Alvin Thompson Jun 17 '15 at 12:17
  • Another note: There's no reason to join against `Trips` at all, since the only thing you need is the trip ID, and you already have that in `Slices` (as `tripIdFK`). – Alvin Thompson Jun 17 '15 at 12:20
  • Hmm...Can you verify that this won't show all trips with at least one matching leg, instead of showing just the trips where all legs are identical? For example, how do you prevent a trip from LON to FRA from showing up if it uses the same flight? Also, how do you eliminate a trip from JFK to BEL (Belgium), if the first two legs are the same, but there's a third leg from FRA to BEL? – Alvin Thompson Jun 17 '15 at 12:32
  • You get serious points in my book for a creative solution, though. – Alvin Thompson Jun 17 '15 at 12:33
  • Alvin is right that this solution will show all the trips where at least one leg is the same. From the example above, in tripId = 3, if you change the last flight number from 12 to 13, the query will show the first three legs but not the fourth. I need all the legs to be the same. Interesting take on the problem though. – elesk01s Jun 17 '15 at 14:09
  • Thanks for the comments Alvin. Yes I missed that part. I have done a small work around to fix it and Edited my answer. if you guys are still interested, please check. – Jasqlg Jun 17 '15 at 15:17
  • `having` clause! clever! – Alvin Thompson Jun 18 '15 at 01:03
  • I tried the edited query and I only get the last matching trip instead of a list of matching trips. Thanks for the effort Jasqlg! – elesk01s Jun 18 '15 at 14:13
  • I think this should be simple. It is because we capture @MatchingTripID in a scalar variable. It can hold only one value. To get all the values make it as table variable and change the select assignment to insert statement. And change the 'where clause' in final query to something like this.... where s.tripIdFK in (select * from @MatchingTripID). – Jasqlg Jun 19 '15 at 05:26
  • @Jasqlg - Yes that did the trick. I'm able to get all the matching trips and validated it using Alvin's method (although yours is faster). I made the changes you suggested but nobody can see them until they're peer reviewed. I'm not sure if the changes I made are visible to you, but if they are, please accept them so others can make use out of them. Thanks for you help with this! It's really made a difference with what i'm trying to do. – elesk01s Jun 23 '15 at 14:03
1

Ow, my brain hurts...

Replace all of the question marks (3 of them) with the trip ID of the trip where you want to check for similar trips.

select distinct s.tripIDFK as tripId
from Legs l
left join Slices s on l.sliceIDFK = s.sliceId
where s.tripIDFK != ?
and not exists (
  select carrier, flightNumber, departureAirport, departureDate
  from Legs l2
  left join Slices s2 on l2.sliceIDFK = s2.sliceId
  where s2.tripIDFK = s.tripIDFK
  except
  select carrier, flightNumber, departureAirport, departureDate
  from Legs l2
  left join Slices s2 on l2.sliceIDFK = s2.sliceId
  where s2.tripIDFK = ?
)
and not exists (
  select carrier, flightNumber, departureAirport, departureDate
  from Legs l2
  left join Slices s2 on l2.sliceIDFK = s2.sliceId
  where s2.tripIDFK = ?
  except
  select carrier, flightNumber, departureAirport, departureDate
  from Legs l2
  left join Slices s2 on l2.sliceIDFK = s2.sliceId
  where s2.tripIDFK = s.tripIDFK
)
order by s.tripIDFK

The meat of the query is the and not exists clauses. They get the leg data for one trip and effectively subtracts the leg data of another trip using the except clause. If you're left with nothing, then the second trip data contains all of the first trip data. You have to run the and not exists clause twice (with the operands reversed) to ensure that the two sets of trip data are truly identical, and that one is not merely a subset of the other.

This is in no way scalable to large numbers of rows.

Alvin Thompson
  • 5,388
  • 3
  • 26
  • 39
  • This worked! Very clever way of solving the problem but like you said it's not very scalable. However, I'm still getting enough data to be able to analyze it further. Thanks for your help! – elesk01s Jun 17 '15 at 19:02
0

Another approach is to determine/store a legsKey with each slice so you can find matching slices across trips.

For each slice, your legsKey is each leg's carrier and flight number appended, which you can do using For XML Path, like this

something like:

select
   distinct  
    stuff((
        select ',' + l.carrier + ':' + l.flightNumber
        from legs l
        where l.carrier = carrier and l.flightnumber = flightnumber
        order by l.carrier, l.flightnumber
        for xml path('')
    ),1,1,'') as legsList
from legs
group by carrier, flightnumber
Community
  • 1
  • 1
Beth
  • 9,531
  • 1
  • 24
  • 43