How to eliminiate duplicates rows in table of sql server 2005 database?

Question

i am using sql server 2005. i have a table contains duplicated rows. how can i eliminated those duplicate rows in that table? for e.g., the table may contain 3 similar rows in which i want to delete 2 rows and keep the original

Please **search** first before posting a question - this has been asked over and over and over again - — marc_s, Nov 26 '10 at 10:56
possible duplicate of [SQL: Deleting duplicate records in SQL Server](http://stackoverflow.com/questions/1772592/sql-deleting-duplicate-records-in-sql-server) — marc_s, Nov 26 '10 at 10:56
Get more lots http://stackoverflow.com/questions/18932/sql-how-can-i-remove-duplicate-rows Looks like ppl like to hate duplicates — Gennady Vanin Геннадий Ванин, Nov 26 '10 at 14:54

score 1 · Answer 1 · edited Nov 26 '10 at 10:54

1

If you just want to eliminate duplicate records from your result set, you can use the DISTINCT command:

SELECT DISTINCTI field1, field2 FROM...

If you want to delete those duplicate records, you can use COUNT to detect which records have more than one instance, and then deleting them with a subquery

edited Nov 26 '10 at 10:54

marc_s

732,580
175
1,330
1,459

answered Nov 26 '10 at 10:36

franz976

333
3
14

If you post code or XML, **please** highlight those lines in the text editor and click on the "code" button (101 010) on the editor toolbar to nicely format and syntax highlight it! – marc_s Nov 26 '10 at 10:54
the action you said above will delete all the duplicated rows but 1 want to keep one – Kumaran T Nov 26 '10 at 10:55
Not if you use MAX(id) in the subquery to return you just the first id instance – franz976 Nov 26 '10 at 11:26
Using a DISTINCT may be a lot more expensive than using the CTE method above with a large dataset or wide table. – DataWriter Nov 27 '10 at 12:40

score 0 · Answer 2 · answered Nov 26 '10 at 10:58

First you can copy duplicate record into another table like as following way

Select fieldnames into #temp from table1 group by fieldnames having count (*) > 1

then remove that record from original table

delete from table1 where fieldname in (select fieldnames from #temp)

and finally copy record from temporary table to original source table.

insert table1 select * from #temp

through above mentioned steps you can eliminate duplicate record from table.

score 0 · Answer 3 · answered Nov 26 '10 at 11:14

For understanding purpose, lets take a simple table Employee with below schema

EmployeeId - int
EmployeeName    varchar(50)
Age int

Lets populate with duplicate values. Please note primary key is not duplicated in this case

INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (1,'Mark',20)
INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (2,'Tom',22)
INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (3,'Sam',24)
INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (4,'Mark',20)
INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (5,'Tom',22)
INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (6,'Tom',22)
GO

we can make use of CTE in finding the duplicate rows. Gather duplicate rows by using Group by/Count statement. Once Duplicate rows are Identified, we select those rows from the main table using join condition. Now rank those Rows and delete all the rows apart from the rows with rank 1. I find this a lot more elegant.

WITH TotalDuplicates(EmployeeName,Age,Total) AS 
(
    SELECT EmployeeName,Age,COUNT(employeeId)  AS Total  FROM Employee 
    GROUP BY EmployeeName,Age
    HAVING COUNT(employeeId) > 1
)
,DistinctRows(EmployeeId,EmployeeName,Age) AS 
(
    SELECT E.EmployeeId,E.EmployeeName,E.Age FROM Employee AS E
    INNER JOIN TotalDuplicates AS D 
    ON (E.EmployeeName = D.EmployeeName AND E.Age = D.Age)
)
,OrderedDuplicateTables(EmployeeId,EmployeeName,Age,Ranking) AS 
(
    SELECT 
        EmployeeId,
        EmployeeName,
        Age,
        RANK() OVER (PARTITION BY EmployeeName, Age ORDER BY EmployeeId DESC) 
    FROM DistinctRows 
)

DELETE FROM Employee
WHERE EmployeeId IN (SELECT EmployeeId FROM OrderedDuplicateTables WHERE Ranking > 1)

How to eliminiate duplicates rows in table of sql server 2005 database?

3 Answers3