0

I know how to delete duplicate records using CTE.

How can we achieve this without using CTE [Interview Question]

Is it possible?

Example: Table with the following data:

name     salary
-----------------
Husain   20000.00
Husain   20000.00
Husain   20000.00
Munavvar 50000.00
Munavvar 50000.00

After deleting the duplicate records table should contains data like this :

name     salary
-----------------
Husain   20000.00
Munavvar 50000.00
Munavvar
  • 802
  • 1
  • 11
  • 33

7 Answers7

4

If you have to delete duplicate rows retaining one such row, you can do so by using ROW_NUMBER() function in SQL. Delete all rows with row number > 1. Assuming you know how ROW_NUMBER() and PARTITION works. If not you can get more information on this on msdn.

DELETE A
FROM
(
SELECT name,salary, ROW_NUMBER() OVER (PARTITION BY name,salary ORDER BY name) AS rn
FROM Tab
) A
WHERE A.rn > 1

Just a heads-up on the response from @TimSchmelter, it will delete all existence of duplicates. Be aware of the difference, you can use either as per your requirement.

Akanksha Singh
  • 351
  • 2
  • 9
  • Thanks Akanksha for response. but it also deleting all the records – Munavvar May 31 '16 at 10:29
  • No it will not. Make sure you are partitioning only with the columns that defines 'DUPLICATE' records. – Akanksha Singh May 31 '16 at 10:48
  • yes it is deleting all rows.. please try with given example (Check question) – Munavvar May 31 '16 at 10:51
  • Are you running exactly what I have shared (now edited with your column names)? If you understand the crux, you can write column names which define duplicate rows. But this should work 100% :) – Akanksha Singh May 31 '16 at 11:09
  • @Munavvar: i thought you need a solution without using a CTE **and** rdbms specific ranking functions like `ROW_NUMBER`. Are you sure that interviewer didn't want an ANSI sql solution which doesn't require SQL-Server >= 2005? – Tim Schmelter May 31 '16 at 15:23
  • @TimSchmelter: interviewer asked me to delete duplicate records. I completed the query with using CTE. After that he asked me to do same without using CTE. – Munavvar Jun 01 '16 at 06:11
2

With sub queries and GROUP BY HAVING COUNT(*) > 1.

Presuming that you want to find duplicates according to two columns col1 and col2:

Query to list all duplicates:

SELECT t.*
FROM dbo.TableName t
INNER JOIN 
(
   SELECT [col1], [col2], Cnt = COUNT(*)
   FROM dbo.TableName t
   GROUP BY [col1], [col2]
   HAVING COUNT(*) > 1
) Duplicates
ON t.[col1] = Duplicates.[col1] AND t.[col2] = Duplicates.[col2]

Delete them:

DELETE dbo.TableName
FROM dbo.TableName t
INNER JOIN 
(
   SELECT [col1], [col2], Cnt = COUNT(*)
   FROM dbo.TableName t
   GROUP BY [col1], [col2]
   HAVING COUNT(*) > 1
) Duplicates
ON t.[col1] = Duplicates.[col1] AND t.[col2] = Duplicates.[col2]
Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • Thanks Tim for quick response. but this will delete all the record with duplicate value. I want to delete repeated data. – Munavvar May 31 '16 at 10:02
  • @Munavvar: so all but one? Here are some approaches: https://www.simple-talk.com/sql/t-sql-programming/removing-duplicates-from-a-table-in-sql-server/ – Tim Schmelter May 31 '16 at 10:04
0
  1. Correlated Subqueries
  2. Self Joins
Sean Pearce
  • 1,150
  • 5
  • 10
0

select * from DuplicateRcordTable

enter image description here

WITH CTE 
AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY Col3 ORDER BY Col3) AS DuplicateCount
FROM DuplicateRcordTable
)
DELETE
FROM CTE
WHERE DuplicateCount > 1

select * from DuplicateRcordTable2

enter image description here

0

Most common use of Common Table Expression (CTE) is Deleting duplicate record from table which not having Primary Key

WITH a as
(
SELECT A,ROW_NUMBER() OVER(PARTITION by A, B ORDER BY A) 
AS duplicateRecCount dbo.A
)
--Now Delete Duplicate Records
DELETE FROM A
WHERE duplicateRecCount > 1 [enter image description here][1]
Bheemsen Singh
  • 141
  • 1
  • 10
  • The question from the O.P. was how to do it WITHOUT using a CTE. Not every SQL RDBMS offers CTEs. I know of at least two such systems. – Forbin Feb 15 '22 at 22:53
0

We can also Delete Duplicate Record from Table by Using Cursor or by using another lopping statements. following code spinets deletes Duplicates Record by using Cursor

  DECLARE @A nvarchar(50), @B nvarchar(50) , @C nvarchar(50), @TotalDuplicate INT ; 



DECLARE a_cursor CURSOR FOR   
SELECT A, B  ,C ,COUNT(1)-1 AS TotalDuplicate
FROM A  
GROUP BY  A, B  ,C 
HAVING COUNT(1)>1

OPEN a_cursor  

FETCH NEXT FROM a_cursor INTO @A, @B,@C,@TotalDuplicate  

WHILE @@FETCH_STATUS = 0  
BEGIN  

    DELETE TOP (@TotalDuplicate) FROM A WHERE A=@A AND B=@B AND C=@C 

    FETCH NEXT FROM a_cursor INTO @A, @B,@C,@TotalDuplicate  

END

CLOSE a_cursor;  
DEALLOCATE a_cursor;  
Bheemsen Singh
  • 141
  • 1
  • 10
0
Select *  into #temp 

from Emp_Details 

group by Name, Salary

having count(*) =1 

Now, Truncate the original table

Truncate table Emp_Details 

Again, inserting the records from temporary table to Emp_Details

insert Emp_Details (Name, Salary, Address)

Select Name, Age, Address from #temp
Anshul Dubey
  • 349
  • 4
  • 14