I have a table with duplicate entries which I want to remove, keeping none but one of the duplicate entries. As you can see, they are exactly the same in every colum, there's no way to differentiate them:
I've used this query to figure out how many duplicates I have:
select url_rewrite_id, category_id, product_id, count(*) cnt
from catalog_url_rewrite_product_category
group by url_rewrite_id, category_id, product_id
having cnt > 1
order by cnt desc
I could use a variant of this to remove all duplicates:
delete
from catalog_url_rewrite_product_category
where url_rewrite_id in (
select url_rewrite_id
from catalog_url_rewrite_product_category
group by url_rewrite_id, category_id, product_id
having count(*) > 1
)
The problem I have with this is it would remove all entries that are duplicates and wouldn't keep the last one.
Earlier questions (here and here) assume a unique id
column of sorts which is not the case with the data structure I have.