0

I'm trying to find out the pitstop strategy followed by teams in F1 and which strategy on average yields the better winning results. However every time I try to execute the query I just get redundant data.

Dataset Example:

DriverTable

DriverId DriverRef DriverNumber
1 Lewis 22
2 Alonso 14
3 Max 1

RaceResult

ResultID RaceID DriverID Number Position Laps
1 12 1 22 1 53
1 12 2 14 6 53
1 12 3 1 2 53
2 13 1 22 2 57
2 13 2 14 6 57
2 13 3 1 1 57

Races

RaceID year CircuitID Name
12 2009 1 Monza
13 2013 2 Bahrain Sakhir

PitStops

RaceID DriverID Stop Lap
12 1 1 17
12 1 2 34
12 2 1 14
12 2 2 42
12 3 1 20
12 3 2 37
13 1 1 14
13 1 2 32
13 2 1 12
13 2 2 34
13 3 1 20
13 3 2 42

My desired result table would look something similar to this.

StrategyChoices

DriverRef RaceID CircuitID Name Stop Lap
Lewis 12 1 Monza 1 17
Lewis 12 1 Monza 2 34
Max 13 2 Bahrain Sakhir 1 20
Max 13 2 Bahrain Sakhir 2 42

The goal here find out what pit strategy did the winning driver use on a certain track.

SELECT ra.year, ra.name, d.properdriverref, (SELECT DISTINCT re.number FROM results WHERE re.position = 1), p.stop, p.lap, re.position
FROM pit_stops p
JOIN results re ON re.raceId = p.raceId
JOIN races ra ON p.raceId = ra.raceId
JOIN DriversXL d ON p.driverId = d.driverId    
WHERE ra.year >= 2018 AND re.position = 1

Was the code that I used and the data set returned a

DriverRef RaceID CircuitID Name Stop Lap
Lewis 12 1 Monza 1 17
Lewis 12 1 Monza 2 34
Max 12 1 Monza 1 17
Max 12 1 Monza 2 34
Alonso 12 1 Monza 1 17
Alonso 12 1 Monza 2 34
Max 13 2 Bahrain Sakhir 1 20
Max 13 2 Bahrain Sakhir 2 42
Lewis 13 2 Bahrain Sakhir 1 20
Lewis 13 2 Bahrain Sakhir 2 42
Alonso 13 2 Bahrain Sakhir 1 20
Alonso 13 2 Bahrain Sakhir 2 42
Dale K
  • 25,246
  • 15
  • 42
  • 71
OOHomie
  • 11
  • 3
  • Nothing appears incorrect with your query based on the info provided. Simple check would be to use ```SELECT *``` and then comment out each join one by one, until your number of rows are correct. If I had to guess, there's a composite key and you missed the second column or perhaps may just have a wrong table , like maybe you should be using ```driver``` and not ```DriverXL``` – Stephan Mar 23 '22 at 23:14
  • I have attempted to re-create a whole new dataset just for the winners but the problem is that its showing me a lot of redundant data. I tried using the distinct clause but to no avail so I'm kinda stuck now – OOHomie Mar 24 '22 at 00:38
  • Please highlight which are the redundant data – Squirrel Mar 24 '22 at 01:03
  • So I am unaware on how to format a comment like a post, but for the data, it shows all the pitstops that the drivers took on a certain track and not the winning driver which yields unnecessary data – OOHomie Mar 24 '22 at 01:47
  • Does this answer your question? [Two SQL LEFT JOINS produce incorrect result](https://stackoverflow.com/questions/12464037/two-sql-left-joins-produce-incorrect-result) – philipxy Mar 24 '22 at 02:24
  • Is the that the actual query ? the result columns doesn't really matches with the sample result that you have shown – Squirrel Mar 24 '22 at 03:22
  • Double check the `join` condition between the table. Like `pit_stops` and `results`. You only join on `raceId`. Why `driverId` is not in the join condition – Squirrel Mar 24 '22 at 03:44
  • the sub-query is rather weird `(SELECT DISTINCT re.number FROM results WHERE re.position = 1)` What are you trying to do here ? Why don't you just return `re.number` directly since you already have `result` table in the `FROM` tables – Squirrel Mar 24 '22 at 03:45
  • Thank you everyone for participating the answer seemed to have answered my problem. To answer the questions above: I tried using distinct to show the driver number where their finishing position was first. I believe it was a join issue, if anyone has any type of resources for a proper join tutorial it'll be much appreciated! – OOHomie Mar 24 '22 at 14:35

1 Answers1

0

You are missing an extra join column between pit_stop and results.

It's unclear the point of that subquery, so I have removed it

SELECT
  d.driverref,
  r.raceId,
  r.circuitId,
  ra.year,
  ra.name,
  p.stop,
  p.lap
FROM pit_stops p
JOIN results re ON re.raceId = p.raceId AND re.DriverID = p.DriverID
JOIN races ra ON p.raceId = ra.raceId
JOIN DriversXL d ON p.driverId = d.driverId    
WHERE re.position = 1;

db<>fiddle

Charlieface
  • 52,284
  • 6
  • 19
  • 43