Remove duplicate rows - sql server

Question

I have tried some solutions from other questions, but they did not work well. I have a table with 2 columns, and I need to remove the row duplicates.

Column1 --- Column2

1 ----------------    3

1 ----------------    1

2 ----------------    1

1 ----------------    1  (remove this)

I found the duplicates with this:

select c1,c2, count(*)
from table  
group by c1,c2
having count(*) > 1

How can I remove the duplicates and keep one entry? (sql server 2008r2)

What are the "some solutions" you tried and how did they not work well? — Michael McGriff, Mar 08 '16 at 14:28
@Frisbee So? Could you elaborate how it collides with my comment? — Lukasz Szozda, Mar 08 '16 at 14:34
@lad2025 I did not look at every solution but the ones I did depended on the unique identifier — paparazzo, Mar 08 '16 at 14:42
@Frisbee The second one is exactly as proposed by Zohar Peled — Lukasz Szozda, Mar 08 '16 at 14:42
This is not a duplicate of [How can I remove duplicate rows?](http://stackoverflow.com/questions/18932/how-can-i-remove-duplicate-rows). in the other question there was a unique row identifier but there is no such thing in this question. The fact that the answers to the other question rely on this unique row identifier prevents the OP of using the suggested solutions there (though they are good solutions, no doubt) — Zohar Peled, Mar 08 '16 at 14:58
@ZoharPeled There are plenty of answers on that question that use a CTE without unique identifiers, case closed. The accepted answer might use a unique identifier but the others don't. — user692942, Mar 08 '16 at 15:00
@ZoharPeled You are right. But the following Q&A on SO would be a perfect fit: http://stackoverflow.com/questions/3317433/delete-duplicate-records-in-sql-server (all you need to do is replace the order by clause with `order by (select 1)`). — Ralph, Mar 08 '16 at 15:02
Actually, I now saw Martin Smith's answer on the other question and he explains why he is using the order by (select 0). I don't think it should be much different then my suggestion, but Martin seem to know about tSql much more then me, and I don't have the time for a test case right now. — Zohar Peled, Mar 08 '16 at 15:07
@Ralph seems like there is no difference if the order by is on (select 1) or on col1, col2. execution plan is the same in both cases. — Zohar Peled, Mar 08 '16 at 16:48

score 2 · Accepted Answer · answered Mar 08 '16 at 14:25

2

One option is to use a common table expression with row_number:

;WITH CTE AS
(
    SELECT Col1, Col2, ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY Col1, Col2) As rn
    FROM @MyTable 
)

DELETE
FROM CTE 
WHERE rn > 1

SELECT * 
FROM @MyTable

answered Mar 08 '16 at 14:25

Zohar Peled

79,642
10
69
121

1

Why did you need to use a CTE to do that? Surely a subquery is simpler and backwardly compatible *(ok, noticed thay are using SQL Server 2008 R2)*? Although [`ROW_NUMBER()`](https://msdn.microsoft.com/en-us/library/ms186734.aspx?f=255&MSPPError=-2147217396) didn't come in until 2008 either. – user692942 Mar 08 '16 at 14:28
2

@Lankymart: Common table expressions was introduced in Sql server 2005, so 2008 r2 will support them for sure. about the simplicity, I don't think a sub query or a derived table would be simpler, it does take some getting used to, but cte's are quite simple and readable once you are used to working with them. – Zohar Peled Mar 08 '16 at 14:34
1

I know that, what's your point? There are still SQL Server 2000 servers out there, thankfully I'm not using them. My point about CTE is it just seemed overkill but each to their own. – user692942 Mar 08 '16 at 14:37
@Lankymart: Sql server 2000 is no longer supported by Microsoft, so I don't see a reason to give an answer that will work with it when the question is tagged with the 2008 r2 version. If the tag was Sql server 2000 then my answer would be different. Would you avoid using LAG or LEAD with Sql server 2016 just because it's not supported on 2008 version? As for the cte being an overkill, I don't see how. I doubt if the query plan would be different using a derived table then using a cte. – Zohar Peled Mar 08 '16 at 14:54
Downvoter, care to explain? – Zohar Peled Mar 08 '16 at 14:55
That's fine *(I won't dispute 2000 is no longer supported)*, there is a string of similar answers on the duplicate question so it's been covered. Also before you start rage voting against me I wasn't the down-vote but can if you want me to? – user692942 Mar 08 '16 at 14:57
@Lankymart I didn't even suspected that was you. This is why I asked for an explanation from "Downvoter" - not from you. – Zohar Peled Mar 08 '16 at 15:01
Fine have an up-vote. – user692942 Mar 08 '16 at 15:02
Thanks Zohar! Worked!! – Brupikk Mar 09 '16 at 13:25

Remove duplicate rows - sql server

1 Answers1