0

How can I avoid duplicate using the provided below query

INSERT INTO dbo.Entities(EntityId, [Name], [Description], [Type], Source)
    SELECT DISTINCT 
        CUST_CODE, NAME, FULLDESCRIPTION, 'Agency' AS Type, 'SunDbAgencies' AS Source
    FROM dbo.VW_SUNDB_AGENCIES 

I've already tried all the answers here :

Avoid duplicates in INSERT INTO SELECT query in SQL Server to no avail.

The duplicates are in the dbo.VW_SUNDB_AGENCIES TABLES not the INSERT table, So I gather I need a way to remove duplicate from the select before inserting

Here is one of the duplicates which is why a simple distinct doesn't work:

enter image description here

enter image description here

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
Jesse
  • 1,846
  • 2
  • 22
  • 32

1 Answers1

2

The problem is that select distinct is not sufficient. You still have duplicates in the underlying table, but with different names or descriptions.

I view this as a problem. But, you can work around it by selecting one arbitrary row per cust_code, using row_number():

insert into dbo.Entities (EntityId, [Name], [Description], [Type], Source)
  select CUST_CODE, NAME, FULLDESCRIPTION, 'Agency' AS [Type], 'SunDbAgencies' AS Source
  from (select a.*,
               row_number() over (partitoin by cust_code order by cust_code) as seqnum
        from dbo.VW_SUNDB_AGENCIES a
       ) a
  where seqnum = 1 and
        not exists (select 1 from dbo.Entities E where A.CUST_CODE = E.EntityId);
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • The view is definitely a problem , but I can't change it. The provided script works perfectly. Thanks – Jesse Nov 20 '19 at 21:38
  • I just faced the same problem a couple days ago. I love the 'Where rownumber = 1' solution. Very, very, very elegant. – ASH Nov 23 '19 at 23:46