This is a topic, which has been discussed several times in here, but for me it's difficult to understand how to delete duplicated rows in our MySQL database. Yes, I have seen many examples, but they are simplified to 2 columns etc. and the solution is most of the time just query, which has not been explained. As I want to securely clean our database. I have backed up my table, and now I wan't to delete duplicated rows (as I have many of them). Below is a snippet of query, which I have tried, and it is fairly simple: if duplicated rows, pick latest row and show ID of the row.
SELECT MAX(id) FROM Raw_Validated_backup GROUP BY col1, col2, col3, ... col_n-1 having COUNT(*) > 1;
Now, as I have selected latest row of duplicated rows, I 'just' have to delete the duplicated rows. Easier said that done, if you ask me. This is a query written by @Jose Rui Santos:
delete test
from test
inner join (
select max(id) as lastId, email
from test
where email in (
select email
from test
group by email
having count(*) > 1
)
group by email
) duplic on duplic.email = test.email
where test.id < duplic.lastId;
He replied to this post: MySQL delete duplicate records but keep latest
Question is, how do I exactly delete duplicated rows and keep the latest? Prefer to be in a pseudo like format, so others can learn from this as well. For example:
DELETE table FROM database
And not specific like:
DELETE email FROM city
MySQL Queries
EXAMPLE 1:
DELETE t1 FROM Raw_Validated_backup AS t1 INNER JOIN Raw_Validated_backup AS t2
ON t1.time_start=t2.time_start
AND t1.time_end=t2.time_end
AND t1.first_temp_lpn=t2.first_temp_lpn
AND t1.first_WL=t2.first_WL
AND t1.first_temp_lpn_validated=t2.first_temp_lpn_validated
AND t1.second_temp_lpn=t2.second_temp_lpn
AND t1.second_WL=t2.second_WL
AND t1.second_temp_lpn_validated=t2.second_temp_lpn_validated
AND t1.third_temp_lpn=t2.third_temp_lpn
AND t1.third_WL=t2.third_WL
AND t1.third_temp_lpn_validated=t2.third_temp_lpn_validated
AND t1.first_temp_rising=t2.first_temp_rising
AND t1.first_WR=t2.first_WR
AND t1.first_temp_rising_validated=t2.first_temp_rising_validated
AND t1.second_temp_rising=t2.second_temp_rising
AND t1.second_WR=t2.second_WR
AND t1.second_temp_rising_validated=t2.second_temp_rising_validated
AND t1.third_temp_rising=t2.third_temp_rising
AND t1.third_WR=t2.third_WR
AND t1.third_temp_rising_validated=t2.third_temp_rising_validated
AND t1.id<t2.id;
EXAMPLE 2:
DELETE FROM Raw_Validated_backup WHERE id NOT IN (
SELECT max(id) FROM Raw_Validated_backup
GROUP BY time_start, time_end, first_temp_lpn, first_WL, first_temp_lpn_validated, second_temp_lpn, second_WL, second_temp_lpn_validated, third_temp_lpn, third_WL, third_temp_lpn_validated, first_temp_rising, first_WR, first_temp_rising_validated, second_temp_rising, second_WR, second_temp_rising_validated, third_temp_rising, third_WR, third_temp_rising_validated )