Delete duplicates but keep 1 with multiple column key

Question

I have the following SQL select. How can I convert it to a delete statement so it keeps 1 of the rows but deletes the duplicate?

select s.ForsNr, t.* 
from [testDeleteDublicates] s
join (
    select ForsNr, period,  count(*) as qty
    from [testDeleteDublicates]
    group by ForsNr, period 
    having count(*) > 1
) t on s.ForsNr = t.ForsNr and s.Period = t.Period

will u please show the data too or create a fiddle? Do you want to Keep the OLDEST record and delete new records from 'testDeleteDublicates' ? i.e Delete all other records which are added later and keep the older one? — SHEKHAR SHETE, Jul 07 '14 at 09:29
Added `sql-server` tag due to the usage of non-standard `[..]` "quoting" — , Jul 07 '14 at 09:43

SHEKHAR SHETE · Accepted Answer · 2014-07-07T11:05:48.007

0

Try using following:

Method 1:

DELETE FROM Mytable WHERE RowID NOT IN (SELECT MIN(RowID) FROM Mytable GROUP BY Col1,Col2,Col3)

Method 2:

;WITH cte
     AS (SELECT ROW_NUMBER() OVER (PARTITION BY ForsNr, period 
                                       ORDER BY ( SELECT 0)) RN
         FROM   testDeleteDublicates)
DELETE FROM cte
WHERE  RN > 1

Hope this helps!

NOTE: Please change the table & column names according to your need!

edited Jul 07 '14 at 11:05

answered Jul 07 '14 at 09:26

SHEKHAR SHETE

5,964
15
85
143

We are only looking at a single table. No ID column – user3688856 Jul 07 '14 at 09:28
I have Updated my answer please check another method for the same! – SHEKHAR SHETE Jul 07 '14 at 11:06

score 0 · Answer 2 · answered Jul 07 '14 at 09:31

This is easy as long as you have a generated primary key column (which is a good idea). You can simply select the min(id) of each duplicate group and delete everything else - Note that I have removed the having clause so that the ids of non-duplicate rows are also excluded from the delete.

delete from [testDeleteDublicates]
where id not in (
    select Min(Id) as Id
    from [testDeleteDublicates]
    group by ForsNr, period 
)

If you don't have an artificial primary key you may have to achieve the same effect using row numbers, which will be a bit more fiddly as their implementation varies from vendor to vendor.

It deletes all my rows :) – user3688856 Jul 08 '14 at 14:25 — user3688856, Jul 08 '14 at 14:25

score 0 · Answer 3 · edited May 23 '17 at 11:57

You can do with 2 option.

Add primary-key and delete accordingly

http://www.mssqltips.com/sqlservertip/1103/delete-duplicate-rows-with-no-primary-key-on-a-sql-server-table/

'2. Use row_number() with partition option, runtime add row to each row and then delete duplicate row.

Removing duplicates using partition by SQL Server

--give group by field in partition.
;with cte(
    select ROW_NUMBER() over( order by ForsNr, period partition ForsNr, period) RowNo , * from [testDeleteDublicates]
    group by ForsNr, period 
    having count(*) > 1
)

select RowNo from cte 
group by ForsNr, period

Delete duplicates but keep 1 with multiple column key

3 Answers3