0

I've a flights table that consists of few columns but somebody seem to have ran a migration twice that resulted in creation of same data twice.

Anyway, the flight should only have only data from the following condition: The flight_number and the date. Basically the table is looking like this at the moment:

flight_number date
123 2021-09-16
123 2021-09-16
123 2021-09-17
124 2021-09-18
124 2021-09-18

Result I want:

flight_number date
123 2021-09-16
123 2021-09-17
124 2021-09-18

Basically, keep only one and remove duplicated (if the flight_number is same of the same date).

I'm looking for a DELETE SQL query but couldn't find the one like I am looking for. What is the query that can help me achieve it? Thanks!

EDIT: Yes, all the data has a column id that is unique even if the data is same.

Nish Dekardo
  • 329
  • 2
  • 11

1 Answers1

0

You need to identify which rows to keep and which to remove; this can be done as such:

delete ff from 
flight ff
inner join (


select flight_number, row_number() over (partition by flight_number order by date) as RN
from flight f

) dups
on ff.flight_number = dups.flight_number
where dups.rn > 1

Basically, this uses Row_Number to create a row identifier based on certain criteria, in this case, for each (partition) Flight_number, create a row number then delete any records where the row_number is > 1.

You will need to change this to use the actual ID column on the join, like this https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=58a4ac7235ea22b557116ad68c8449c3

Aron
  • 765
  • 6
  • 14