2

I have a Table for Customers by name : Customer_SCD in SQL I have 3 Columns present in it : Customer_Name, Customer_ID Customer_TimeStamp

There are duplicate entries in this table with different Timestamp.

For example

ABC, 1, 2012-12-05 11:58:20.370

ABC, 1, 2012-12-03 12:11:09.840

I want to eliminate this from the database and keep the first time/date available.

Thanks.

Prahalad Gaggar
  • 11,389
  • 16
  • 53
  • 71
Niky Rathod
  • 43
  • 1
  • 7

2 Answers2

3

This works, try it:

DELETE  Customer_SCD
OUTPUT  deleted.*
FROM    Customer_SCD b
JOIN    (
    SELECT  MIN(a.Customer_TimeStamp) Customer_TimeStamp,
            Customer_ID,
            Customer_Name
    FROM    Customer_SCD a
    GROUP   BY a.Customer_ID, a.Customer_Name
) c ON 
    c.Customer_ID = b.Customer_ID
AND c.Customer_Name = b.Customer_Name
AND c.Customer_TimeStamp <> b.Customer_TimeStamp

In a subquery it determines which record is the first one for every Customer_Name,Customer_ID and then it deletes all the other records for a duplicate. I also added the OUTPUT clause which returns rows affected by the statement.

You could also do it by using ranking function ROW_NUMBER:

DELETE  Customer_SCD
OUTPUT  deleted.*
FROM    Customer_SCD b
JOIN    (
    SELECT  Customer_ID,
            Customer_Name,
            Customer_TimeStamp,
            ROW_NUMBER() OVER (PARTITION BY Customer_ID, Customer_Name ORDER BY Customer_TimeStamp) num
    FROM    Customer_SCD
) c ON 
    c.Customer_ID = b.Customer_ID
AND c.Customer_Name = b.Customer_Name
AND c.Customer_TimeStamp = b.Customer_TimeStamp
AND c.num <> 1

See which one has a smaller query cost and use it, when I checked it, first approach was more efficient (it had a better execution plan).

Here's an SQL Fiddle

Ivan Golović
  • 8,732
  • 3
  • 25
  • 31
0

The following query will give you the results you want to keep.

Select Customer_Name, Customer_ID, MIN(Customer_TimeStamp) as Customer_TimeStamp
from Customer_SCD 
group by Customer_Name, Customer_ID 

store the result in a table variable, say @correctTbl

then join with this table and remove duplicates

delete 
from Customer_SCD a
inner join @correctTbl b on a.Customer_Name = b.Customer_Name and a.Customer_ID = b.Customer_ID and a.Customer_TimeStamp != b.Customer_TimeStamp
Estefany Velez
  • 328
  • 4
  • 18