Mysql delete duplicate rows + sort condition

Question

I'm trying to delete some rows from my Mysql table, when one key is duplicate (here "url") and keep a particular one (here the smallest "key1" and "key2")

Example :

Table t1
Id    Url    Key1    Key2
1     A.com   10      10
2     B.com   20      25
3     B.com   21      25
4     C.com   35      35
5     C.com   35      37

Desired output is :

Table t1
Id    Url    Key1    Key2
1     A.com   10      10
3     B.com   21      25
5     C.com   35      37

So the query (if it exists) should look like :

Select rows where Url are duplicate
Then sort by Key1 and remove the row where Key1 is strictly inferior
if Key1 are equal, remove the row where Key2 is inferior

Thanks

possible duplicate of [How to delete duplicates on a MySQL table?](http://stackoverflow.com/questions/2630440/how-to-delete-duplicates-on-a-mysql-table) — Norbert, Sep 20 '15 at 15:31

score 1 · Accepted Answer · answered Sep 20 '15 at 14:41

You want to keep the rows where key1, key2 are maximal. An easy way to express this is:

delete t1 from table t1
    where exists (select 1
                  from t1 t11
                  where t11.url = t1.url and
                        (t11.key1 > t1.key1 or
                         t11.key1 = t1.key1 and t11.key2 > t1.key2
                        )
                 );

Alas, MySQL doesn't allow this construct, because you using the table being deleted. So, you can do something like this:

delete t1
    from t1 left join
         (select t.*,
                 (select max(key2)
                  from t1
                  where t1.url = t.url and t1.key = t.maxkey1
                 ) as maxkey2
          from (select url, max(key1) as maxkey1
                from t1
                group by url
               ) t
        ) t
        on t1.url = t.url and t1.key1 = t.maxkey1 and t2.key2 = t.maxkey2
    where t.url is null;

score 0 · Answer 2 · answered Sep 20 '15 at 14:40

I think this might be helpful

DELETE t1
  FROM t1 as tb1
  join t1 as tb2
 WHERE tb1.url= tb2.url
   and tb1.id < tb2.id

This way you keep the record with the max value on id column

but if you just want to fetch records

SELECT distinct tb1.*
  FROM t1 as tb1
  join t1 as tb2
 WHERE tb1.url= tb2.url
   and tb1.id < tb2.id

Mysql delete duplicate rows + sort condition

2 Answers2