Delete duplicate row based on column but left only 1 row of the matching duplicate

Question

I have table as below and I need to delete row as per commented out below...

ID  |   AccountID   |   AccountValue|    CreatedDate            | Comment
===========================================================================
1   |   1           |   2           |   2016-06-13 19:58:47.373 | Delete
2   |   1           |   2           |   2017-06-13 19:58:47.373 | Delete
3   |   1           |   2           |   2018-06-13 19:58:47.373 |
4   |   2           |   3           |   2017-06-13 19:58:47.373 |
5   |   4           |   4           |   2017-06-13 19:58:47.373 | Delete
6   |   4           |   4           |   2018-06-13 19:58:47.373 |
7   |   5           |   2           |   2017-06-13 19:58:47.373 |

Can someone please help on who to achieve this?

I have somehow

CREATE TABLE MyAccounts (
    ID int,
    AccountID int,
    AccountValue varchar(255),
    CreatedDate datetime
);

insert into MyAccounts values(1,1,2,'2016-06-13 19:58:47.373')
insert into MyAccounts values(2,1,2,'2017-06-13 19:58:47.373')
insert into MyAccounts values(3,1,2,'2018-06-13 19:58:47.373')
insert into MyAccounts values(4,2,3,'2017-06-13 19:58:47.373')
insert into MyAccounts values(5,4,4,'2017-06-13 19:58:47.373')
insert into MyAccounts values(6,4,4,'2018-06-13 19:58:47.373')
insert into MyAccounts values(7,5,2,'2017-06-13 19:58:47.373')

I know I want to delete this group of data, but want to left 1 row per explained in original question

select 
        AccountID, 
        AccountValue        
        FROM MyAccounts
        GROUP BY AccountID, AccountValue--, createddate
        having count(*) > 1

This is so that the table become as per below

ID  |   AccountID   |   AccountValue|    CreatedDate            | Comment
===========================================================================
3   |   1           |   2           |   2018-06-13 19:58:47.373 |
4   |   2           |   3           |   2017-06-13 19:58:47.373 |
6   |   4           |   4           |   2018-06-13 19:58:47.373 |
7   |   5           |   2           |   2017-06-13 19:58:47.373 |

Possible duplicate of [T-SQL: Deleting all duplicate rows but keeping one](https://stackoverflow.com/questions/6025367/t-sql-deleting-all-duplicate-rows-but-keeping-one) — Eric Brandt, Jul 16 '18 at 11:17

score 0 · Answer 1 · answered Jul 16 '18 at 09:34

You can use row_number() function :

delete m 
from (select *, row_number() over (partition by AccountID, AccountValue order by id desc) as seq
      from MyAccounts
     ) m
where m.seq > 1;

Perhaps you also need createddate field in partition clause. If, so you can include it.

score 0 · Answer 2 · answered Jul 16 '18 at 09:35

0

delete from MyAccounts
where id not in 
(
     select min(id)     
     FROM MyAccounts
     GROUP BY AccountID, AccountValue, createddate
)

answered Jul 16 '18 at 09:35

juergen d

201,996
37
293
362

score 0 · Accepted Answer · answered Jul 16 '18 at 09:36

0

Use a CTE with ROW_NUMBER window fucntion

;WITH cteDups
AS(
    SELECT *, RN=ROW_NUMBER()OVER (PARTITION BY M.AccountID, M.AccountValue ORDER BY M.ID DESC)
    FROM dbo.MyAccounts M
)
--SELECT *
DELETE
FROM cteDups D WHERE D.RN > 1

answered Jul 16 '18 at 09:36

Mazhar

3,797
1
12
29

Delete duplicate row based on column but left only 1 row of the matching duplicate

3 Answers3