2

I'm using this SQL statement to find duplicate records:

  SELECT id, 
         user_id, 
         activity_type_id, 
         source_id, 
         source_type, 
         COUNT(*) AS cnt
    FROM activities
GROUP BY id, user_id, activity_type_id, source_id, source_type
  HAVING COUNT(*) > 1

However, I want to not only find, but delete in the same operation.

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
keruilin
  • 16,782
  • 34
  • 108
  • 175

1 Answers1

1

delete from activities where id not in (select max(id) from activities group by ....)


Thanks to @OMG Ponies and his other post here is revised solution (but not exactly the same). I assumed here that it does not matter which specific rows are left undeleted. Also the assumption is that id is primary key.

In my example, I just set up one extra column name for testing but it can be easily extended to more columns via GROUP BY clause.

DELETE a FROM activities a 
   LEFT JOIN (SELECT MAX(id) AS id FROM activities GROUP BY name) uniqId 
   ON a.id=uniqId.id WHERE uniqId.id IS NULL;
Alex Gitelman
  • 24,429
  • 7
  • 52
  • 49
  • If the OP is really GROUPing BY `id` as the SELECT suggests, then he has duplicated ids, and this won't remove them. – pilcrow May 25 '11 at 03:34
  • error #1093 - `You can't specify target table 'activities' for update in FROM clause` – OMG Ponies May 25 '11 at 03:37
  • @OMG sorry - I forgot how cranky mysql is. I will play with it and see what comes out. May be some temp table? – Alex Gitelman May 25 '11 at 03:53
  • @pilcrow If rows are identical in all columns the best shot is to craft something with rownum. But my guess is that id is PK in this case. – Alex Gitelman May 25 '11 at 04:01
  • @OMG Ponies - revised. Slightly different than your solution but either one is good. – Alex Gitelman May 25 '11 at 04:52
  • @Alex - assuming `id` is not unique, how would you emulate rownum/rowid in MySQL, and how would you further DELETE using it? – pilcrow May 25 '11 at 05:03
  • @pilcrow if rows have no PK and are exactly identical, how would you delete all of them but one? The only way I can see is to find out specific values in duplicated rows and delete with `where [all values] and rownum>1`. But I don't see it being possible in one sql. – Alex Gitelman May 25 '11 at 05:15
  • @Alex, right, it is not possible in one query without some kind of unique row identifier. – pilcrow May 25 '11 at 05:20