0

I’m having trouble trying to de-duplicate records. My data is as follows:

ID   -   AccountId  - ContactId  -   CreatedDATE    
1   -      100   -    1000    -      2017/11/25   
2    -    100    -   1000      -    2016/07/29  
3    -    200    -   2000   -       2015/12/25    
4   -     200  -     2000    -      2012/01/01     
5   -     300   -    3000    -      2010/09/15    
6    -     300   -    3000      -    2019/05/15    
7     -   300     -  3000     -     2018/03/25    
8     -   400    -   4000  -        2010/01/05    
9      -  400     -  4000   -       2011/01/05    
10    -   400  -     4000   -       2019/01/05    
11      - 400  -     4000   -       2014/01/05

I want to remove duplicate AccountId and ContactId records and want to retain the the record from the duplicate with the latest CreatedDate.

Essentially, from this sample data set I only want to keep records with ID as 1, 3, 6 and 10 (duplicate removed and retains latest createddate).

I’ve made a few attempts but am having trouble. Looking for suggestions.

Dale K
  • 25,246
  • 15
  • 42
  • 71
MBP
  • 17
  • 4
  • https://meta.stackexchange.com/questions/220127/how-to-make-tables-in-good-format-in-asking-questions-here-which-is-understanda – Dale K Feb 21 '19 at 00:43
  • What attempts have you tried? And are you aware of how window functions like `ROW_NUMBER()` work? – ZLK Feb 21 '19 at 00:50
  • 1
    Possible duplicate of [How can I remove duplicate rows?](https://stackoverflow.com/questions/18932/how-can-i-remove-duplicate-rows) – Ken White Feb 21 '19 at 00:52

2 Answers2

1
;with CTE as (Select *
    , Row_Number() over (partition by AccountID, ContactID order by CreatedDate desc) as RN
from MyTable)

Select * from CTE where RN = 1

This should return only the first row for each unique accountID/contactID. Note that if there are ties on create date, it will choose one of the rows pseudorandomly.

APH
  • 4,109
  • 1
  • 25
  • 36
0

This should work

select a.* 
    from #test a 
inner join 
(
select accountid, 
    ContactId, 
    max(createddate) as createddate 
    from #test 
    group by accountid, ContactId
) b on a.accountid = b.accountid and a.ContactId = b.ContactId and a.createddate = b.createddate
Shiju Samuel
  • 1,373
  • 6
  • 22
  • 45