I need to delete duplicates from a table while keeping one item. Since I cannot delete from a table while accessing this very same table in a subquery inside the where-statement, i decided to store my affected IDs in a temporary table:
create temporary my_temp_table (
id int not null
) engine memory;
and then insert IDs using a select:
insert into my_temp_table
-- select query works
select
id
from
-- innodb
table_with_duplicates
where
hash_code in (
select
hash_code
from
table_with_duplicates
group by
hash_code
having
count(id) > 1
)
and date_created < '2015-01-01'
;
Later I want to use these IDs to delete them:
delete from table_with_duplicates
where id in (
select id from my_temp_table
)
;
Just executing the select part of the insert statement works fine. Adding the insert part, however, causes 1 CPU core to go up to 100% and the query seems to never end. Nothing is inserted. On my dev environment, table_with_duplicates contains about 20000 rows with 1 duplicate. Any ideas?
EDIT: Thank you for your answers. I have tried a select distinct(... approach, which didn't help much. Maybe I've used in the wrong place/subselect. Played around with it quite a lot already. To clarify, I have something like this:
ID date_created hash_code
1 2013-06-06 ABCDEFGH <-- delete this one
2 2013-08-08 HGFEDCBA
3 2015-11-11 ABCDEFGH <-- keep this one