Removing duplicates using PARTITION BY in SQL Server

Question

I need to remove duplicates from a table:

;WITH cte as(
  SELECT ROW_NUMBER() OVER (PARTITION BY [specimen id]
                            ORDER BY ( SELECT 0 ) ) RN
  FROM   quicklabdump)
delete from cte where RN>1

The column quicklabdumpID is the primary key.

I would like to know how to keep only the row with the largest quicklabdumpID when a [specimen id] occurs more than once.

score 19 · Accepted Answer · answered Feb 03 '12 at 04:58

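Change your ORDER BY to quicklabdumpid DESC. The row with the largest quicklabdumpid in each [specimen id] partition then gets RN = 1 and is the one that is kept.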

WITH cte as(
  SELECT ROW_NUMBER() OVER (PARTITION BY [specimen id]
                            ORDER BY  quicklabdumpid DESC ) RN
  FROM   quicklabdump)
delete from cte where RN>1
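
Before running the DELETE, it may be worth previewing which rows it would remove. The sketch below reuses the same CTE but selects instead of deletes; adding quicklabdumpID and [specimen id] to the select list is an addition for readability, not part of the original answer. Note that in SQL Server a DELETE against a CTE like this one removes the rows from the underlying quicklabdump table.

```sql
-- Preview only: lists the rows the DELETE would remove, without changing anything.
WITH cte AS (
    SELECT quicklabdumpID,
           [specimen id],
           ROW_NUMBER() OVER (PARTITION BY [specimen id]
                              ORDER BY quicklabdumpID DESC) AS RN
    FROM quicklabdump
)
SELECT quicklabdumpID, [specimen id], RN
FROM cte
WHERE RN > 1              -- RN = 1 is the row with the largest quicklabdumpID per specimen
ORDER BY [specimen id], quicklabdumpID;
```

If the listed rows are the ones you expect to lose, swapping the final SELECT for the DELETE shown above should remove exactly those rows.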

Mikael Eriksson

Thanks so much. Can you please tell me whether there is any issue with Clint's solution? – Alex Gordon Feb 03 '12 at 17:47
@I__ - It will do the same. There might be a difference in performance. If you want to know which one will be faster you have to test them on your data. – Mikael Eriksson Feb 03 '12 at 18:03
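
One possible way to run that comparison, sketched under the assumption that you can test against a copy of the data or are comfortable relying on a rollback: wrap each DELETE in a transaction, turn on SQL Server's STATISTICS IO and STATISTICS TIME output, and roll back so both statements see the same rows. The second statement may benefit from pages cached by the first, so it is worth repeating the runs in both orders.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: ROW_NUMBER() over a partition, deleting through the CTE.
BEGIN TRANSACTION;

WITH cte AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY [specimen id]
                              ORDER BY quicklabdumpID DESC) AS RN
    FROM quicklabdump
)
DELETE FROM cte WHERE RN > 1;

ROLLBACK TRANSACTION;   -- undo, so the next variant sees the same data

-- Variant 2: correlated EXISTS subquery.
BEGIN TRANSACTION;

DELETE q
FROM quicklabdump q
WHERE EXISTS (
    SELECT 1
    FROM quicklabdump q2
    WHERE q2.[specimen id] = q.[specimen id]
      AND q2.quicklabdumpID > q.quicklabdumpID
);

ROLLBACK TRANSACTION;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The logical reads and elapsed times appear in the Messages output for each statement; comparing the actual execution plans is another option.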

score 6 · Answer 2 · answered Feb 03 '12 at 01:36

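There is no need for PARTITION BY. A correlated EXISTS subquery deletes every row for which a row with the same [specimen id] and a larger quicklabdumpID exists: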

delete q
  from quicklabdump q
  where exists
  (
    select *
      from quicklabdump q2
      where q2.[specimen id] = q.[specimen id] and
        q2.quicklabdumpID > q.quicklabdumpID
  )

Clint Good

just curious, are you deleting from `quicklabdump` here and @I__ is deleting from the `cte`? – cctan Feb 03 '12 at 02:18
@cctan - cte is the name of the common table expression defined by the WITH clause. Deleting from it deletes the matching rows from quicklabdump. – Clint Good Feb 03 '12 at 03:44
@ClintGood Thank you so much for this. Can you please tell me whether I will need to run this several times if there are more than 2 duplicate [specimen id]s? For example spec123, spec123, and spec123, with quicklabdumpid 1, 2, 3. – Alex Gordon Feb 03 '12 at 04:23
@I__ This will do it in one go. As the query says: if another record has the same specimen id and a bigger quicklabdumpID, then delete this record. – Clint Good Feb 03 '12 at 05:48
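
The three-duplicate case from the comment above can be checked with a small sketch. The temp table #quicklabdump_demo, its column types, and the extra spec456 row are hypothetical and only stand in for the real quicklabdump table; the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical demo table mirroring the two relevant columns.
CREATE TABLE #quicklabdump_demo (
    quicklabdumpID INT PRIMARY KEY,
    [specimen id]  VARCHAR(20) NOT NULL
);

INSERT INTO #quicklabdump_demo (quicklabdumpID, [specimen id])
VALUES (1, 'spec123'),
       (2, 'spec123'),
       (3, 'spec123'),
       (4, 'spec456');   -- a non-duplicated specimen, which must survive

-- Single pass: rows 1 and 2 each have a same-specimen row with a larger ID.
DELETE q
FROM #quicklabdump_demo q
WHERE EXISTS (
    SELECT 1
    FROM #quicklabdump_demo q2
    WHERE q2.[specimen id] = q.[specimen id]
      AND q2.quicklabdumpID > q.quicklabdumpID
);

-- Remaining rows: (3, 'spec123') and (4, 'spec456').
SELECT quicklabdumpID, [specimen id]
FROM #quicklabdump_demo
ORDER BY quicklabdumpID;

DROP TABLE #quicklabdump_demo;
```

Rows 1 and 2 are both removed by the one statement because the EXISTS test is evaluated per row against the table as it stood before the delete, so no second run is needed.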
