User Defined Value for Deleting Duplicates in Stored Procedure

Question

I am currently making use of the code below:

ALTER PROCEDURE [dbo].[sp_TidyBasket]
    @ReferenceNumber VARCHAR
AS
BEGIN
    BEGIN TRANSACTION

    DECLARE
        @GUID VARCHAR,
        @ErrorCode INT

    --Get the First Reference Number of a basket item being duplicated
    SET @GUID = (SELECT TOP 1 MIN(idx6) 
                 FROM iwfAccountOpening 
                 WHERE Idx29 = @ReferenceNumber 
                 GROUP BY Idx37 
                 HAVING COUNT(*) > 1)   

    --Executes a while loop whilst there is duplicates to be removed
    WHILE (@GUID IS NOT NULL)
    BEGIN
        DELETE FROM iwfAccountOpening WHERE Idx6 = @GUID;
    END

    --Rollbacks transactions when any errors occur
    SELECT @ErrorCode = @@ERROR

    IF (@ErrorCode <> 0) GOTO PROBLEM
    COMMIT TRANSACTION

PROBLEM:
    IF (@ErrorCode <> 0) BEGIN
        ROLLBACK TRANSACTION
    END
END

I want it to loop through and delete duplicates (whilst retaining one row of the duplicate data) according to a defined value passed through to it.

Currently this is not showing any changes on my database table. I understand it would be due to the @GUID value I am setting but I have no idea how to approach this otherwise.

This is not a duplicate question of the previously mentioned duplicate question as it focuses' on the deletion of constraints placed by the user WITH duplicates.

possible duplicate of [T-SQL: Deleting all duplicate rows but keeping one](http://stackoverflow.com/questions/6025367/t-sql-deleting-all-duplicate-rows-but-keeping-one) — Mike D., Mar 06 '15 at 12:06
Well spotted but I did read through that question and mine related specifically to a unique user defined value for deleting duplicates as the act of deleting duplicates itself is quite simple. — svbyogibear, Mar 10 '15 at 11:54

score 2 · Accepted Answer · answered Mar 06 '15 at 12:13

Try with this:

ALTER PROCEDURE [dbo].[sp_TidyBasket]
    @ReferenceNumber VARCHAR
AS
    BEGIN TRY

        BEGIN TRANSACTION;

            WITH    cte
                      AS ( SELECT   idx6 ,
                                    ROW_NUMBER() OVER ( PARTITION BY Idx37 ORDER BY ( SELECT ( 1 ) ) ) AS RN
                           FROM     iwfAccountOpening
                           WHERE    Idx29 = @ReferenceNumber
                         )
            DELETE  i
            FROM    cte c
                    JOIN iwfAccountOpening i ON c.idx6 = i.idx6
            WHERE   Idx29 = @ReferenceNumber AND RN > 1

        COMMIT TRANSACTION

    END TRY
    BEGIN CATCH  

        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION

    END CATCH                
GO

score 1 · Answer 2 · answered Mar 06 '15 at 12:05

Try this. No loop required.

 DELETE FROM iwfAccountOpening WHERE Idx6 in (SELECT MIN(idx6) 
                 FROM iwfAccountOpening 
                 WHERE Idx29 = @ReferenceNumber 
                 GROUP BY Idx37 
                 HAVING COUNT(*) > 1) and Idx29 = @ReferenceNumber

Khurram Ali · Answer 3 · 2015-03-06T12:16:06.770

1

you can find all duplicate records by following sample query

SELECT YourColumn, COUNT(*) TotalCount
FROM YourTable
GROUP BY YourColumn
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC

In Your Case you can also do this

Delete from iwfAccountOpening Where Idx29 in
 (           SELECT Count(idx6) 
             FROM iwfAccountOpening 
             WHERE   Idx29 = @ReferenceNumber
             GROUP BY Idx6,Idx37 
             HAVING COUNT(*) > 1 
             ORDER BY COUNT(*) DESC )
   )

edited Mar 06 '15 at 12:16

answered Mar 06 '15 at 12:05

Khurram Ali

1,659
4
20
37

I am able to find the correct duplicates with the code sample I provided, but thank you for the response. – svbyogibear Mar 06 '15 at 12:09

score 1 · Answer 4 · answered Mar 06 '15 at 12:06

This cycle:

WHILE (@GUID IS NOT NULL)
BEGIN
    DELETE FROM iwfAccountOpening WHERE Idx6 = @GUID;
END

looks pretty strange.

First, if @GUID is not null - this will be infinite loop because you're not modifying @GUID inside loop.

Second - this delete statement will not help you to achieve your goal "delete all duplicate except one" - just because here you're deleting only one record with GUID you determinted before.

Based on your logic, it should be something like

DELETE FROM iwfAccountOpening 
WHERE Idx29 = @ReferenceNumber and idx6 <> @GUID

And no cycle needed.

Also SELECT TOP 1 MIN(idx6) looks little bit redundant for me. It can be just SELECT MIN(idx6) and you will get the same result.

User Defined Value for Deleting Duplicates in Stored Procedure

4 Answers4