How can I remove duplicate rows from one SQL Server database table?

Question

I want to delete the duplicate entries from one table of database (SQL). I don't understand how do I do this?

Please help me to write a query for this problem.

For e.g.-

Name | mobileno | address

abc  | 99xxxx01 | India

abc  | 99xxxx01 | India

def  | 98xxxx02 | USA

Deepthi · Accepted Answer · 2016-09-28T09:00:29.017

3

CREATE TABLE #Duplicate
(
    Column1 VARCHAR(5),
    Column2 VARCHAR(20),
    Column3 VARCHAR(20)
)

INSERT INTO #Duplicate VALUES('abc','99xxxx01','India'),('abc','99xxxx01','India'),('def','xxxx02','USA')

SELECT * FROM #Duplicate

;WITH CTE_Duplicate AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY Column1,Column2,Column3 ORDER BY Column1) AS Rno FROM #Duplicate
)
DELETE FROM CTE_Duplicate WHERE Rno = 2

SELECT * FROM #Duplicate

edited Sep 28 '16 at 09:00

answered Sep 28 '16 at 08:19

Deepthi

86
1
7

This would delete all your data if you'd feed that resultset to your delete.. – Jens Sep 28 '16 at 08:21
Close but no cigar. This would only delete the first occurance of a duplicate. If you have a row 3 or 4 times as a duplicate you'd still have those. You should delete WHERE Rno > 1. – Jens Sep 28 '16 at 09:02
then change the WHERE Rno >= 2 – Deepthi Sep 28 '16 at 09:12

Mathew · Answer 2 · 2016-09-28T08:25:11.110

0

By using a common table expression to group your rows by what's duplicated and deleting all but one of them.

Have a look at this link: http://www.sqlservercentral.com/blogs/vivekssqlnotes/2013/12/29/deleting-duplicate-rows-using-cte/

So yours might look like this:

With cte_duplicate (name, mobileno, address, rownumber)
as (select name, mobileno, address, row_number()over(partition by name, mobileno, address) from your_table_name
)
delete from cte_duplicate where rownumber<>1

edited Sep 28 '16 at 08:25

answered Sep 28 '16 at 08:14

Mathew

4,297
4
29
38

That won't work. There is no field in cte_duplicate called rownumber. Using the name "Rank" as an alias is also less than ideal. – JohnHC Sep 28 '16 at 08:17
Thanks @JohnHC, added the missing rownumber from the cte declaration. – Mathew Sep 28 '16 at 08:25

score 0 · Answer 3 · answered Sep 28 '16 at 08:14

0

with DelCTE as
(
select Name, MobileNo, Address, row_number() over(partition by Name, MobileNo, Address order by 1) as Row_Ord
from TableName
)
delete from DelCTE
where Row_Ord >1

answered Sep 28 '16 at 08:14

JohnHC

10,935
1
24
40

Although this code may help to solve the problem, it doesn't explain _why_ and/or _how_ it answers the question. Providing this additional context would significantly improve its long-term educational value. Please [edit] your answer to add explanation, including what limitations and assumptions apply. – Toby Speight Sep 28 '16 at 08:32

score -1 · Answer 4 · answered Sep 28 '16 at 08:16

-1

;WITH numbered AS (
    SELECT ROW_NUMBER() OVER(PARTITION BY [dupe-column-list] ORDER BY [dupe-column-list]) AS _dupe_num 
    FROM [table-name] 
    WHERE 1=1 -- any where clause if required
)
DELETE FROM numbered WHERE _dupe_num > 1;

This should provide you with a solid result.

answered Sep 28 '16 at 08:16

Jens

3,249
2
25
42

Deleted every row... – Steve Coleman Dec 01 '19 at 05:10

score -2 · Answer 5 · answered Sep 28 '16 at 08:14

-2

Something like this

BEGIN TRAN
SELECT DISTINCT * INTO #TEMP
DELETE TABLE
INSERT INTO TABLE FROM #TEMP
SELECT * FORM TABLE --just to check all is good
-- COMMIT TRAN -- manually execute this if all is good with your data

answered Sep 28 '16 at 08:14

Alexandru Aliu

474
1
4
17

Removing duplicates by dropping table is *BAD* idea, don't do this. – nimdil Sep 28 '16 at 08:19
Dude, read the SQL Documentation DELETE is different from DROP. DELETE Table has the same result as DELETE FROM Table but it's actually faster. – Alexandru Aliu Sep 28 '16 at 08:21
1

If we are assuming we're talking MS SQL the FROM keyword is optional indeed, but claiming it's faster is a bit far fetched. It's pretty much like JOIN & INNER JOIN. – Jens Sep 28 '16 at 08:25
That's different, you might be right , but DELETE does not DROP the table. – Alexandru Aliu Sep 28 '16 at 08:26
My algorithm its't exactly like the other answers. Getting the unique records in a temp table, delete the source, insert unique lines. I don't get it – Alexandru Aliu Sep 28 '16 at 08:27
1

Correct, you aren't deleting the table, but you're deleting EVERYTHING inside it, resulting in a full table lock which could take quite a while in an active environment given you'd also have to rebuild all the indexes attached to it. This is not actually seen as best practice. – Jens Sep 28 '16 at 08:29
Whatever. I have just provided the general idea not a complete solution - this is what he asked for. Maybe he has only 3 lines in his table. At least you have a point, nimdil hasn't ! – Alexandru Aliu Sep 28 '16 at 08:36
True, but you should always strive for best practice. Let's say this is a 200k row table & there's one duplicate.. Or this table has several foreign keys in other tables. This is a quick an dirty solution which could work in a test environment, but please don't ever do this in a production environment. You just wouldn't make any friends with it. – Jens Sep 28 '16 at 08:38
@AlexandruAliu I'm not in your head. When you write "Delete Table" then I'm not even sure what you mean. Deleting whole content of a table for that purpose is marginally better but still a long journey into bad practice zone. AFAIC -1. General idea based on emptying a table is just bad general idea. – nimdil Sep 29 '16 at 13:20

How can I remove duplicate rows from one SQL Server database table?

5 Answers5