Removing duplicate SQL table rows

Question

My sql table is have not primary key column. I can find dublicate rows but i dont know how i can remove to one to rest; Let me explain;

col1     col2    col3    col4
10       0       1000    1    
10       0       1000    1    --> should be deleted
10       0       1111    2    --> should be deleted
10       1       1000    1
10       2       1000    1
15       0       1000    1
15       0       1000    1    --> should be deleted
16       0       1000    1

I use col1 and col2 for understand row is dublicate. Same time "10" "0" must be unique but tables can contains many "10" or many "0" value.

Thanks.

My first thought: Add an auto-increment id and use that one to delete the duplicates. — juergen d, Mar 11 '14 at 19:56
What SQL are you using ? (MySQL, PostgreSQL, MSSQL, Oracle, ...) — deroby, Mar 11 '14 at 19:57
@user2672165 Actually i search in net before ask it. But many examples have one different value in columns. My dublicates are totally same sometimes. I must match 1st and 2nd columns.. — Black White, Mar 11 '14 at 20:02
The referenced questions all have an 'extra' id column that seems to be lacking here making their solutions not applicable. — deroby, Mar 11 '14 at 20:12
Hmm.. just realized I went over the top and assumed you wanted all 4 columns to identify uniqueness. I'll adapt so it only looks at col1 and col2, but the question then becomes: how do you know you want the 2nd and 3rd row removed and not the 1st one ? Or doesn't it matter as long as you only keep 1 of the three, whichever that may be. — deroby, Mar 11 '14 at 20:16
@deroby col3 and col4 can be different or can be same not a problem. But col1="10" and col2="0" combination can only be one. Or col1="11" and col2="0" etc. — Black White, Mar 11 '14 at 20:21

deroby · Accepted Answer · 2014-03-11T20:22:45.163

This should work. It first finds out how many duplicates there are per combination and then deletes them all but one.

CREATE TABLE t_test (col1 int, col2 int, col3 int, col4 int)

INSERT t_test 
          SELECT 10, 0, 1000, 1 
UNION ALL SELECT 10, 0, 1000, 1 --> should be deleted
UNION ALL SELECT 10, 0, 1111, 2 --> should be deleted
UNION ALL SELECT 10, 1, 1000, 1
UNION ALL SELECT 10, 2, 1000, 1
UNION ALL SELECT 15, 0, 1000, 1
UNION ALL SELECT 15, 0, 1000, 1 --> should be deleted
UNION ALL SELECT 16, 0, 1000, 1

DECLARE @col1 int, @col2 int, @count int

DECLARE delete_loop CURSOR LOCAL STATIC
    FOR SELECT COUNT(*), col1, col2
          FROM t_test
         GROUP BY col1, col2
        HAVING COUNT(*) > 1
OPEN delete_loop
FETCH NEXT FROM delete_loop INTO @count, @col1, @col2
WHILE @@FETCH_STATUS = 0
    BEGIN
        DELETE TOP (@count - 1)
          FROM t_test
         WHERE col1 = @col1
           AND col2 = @col2

        FETCH NEXT FROM delete_loop INTO @count, @col1, @col2
    END
CLOSE delete_loop
DEALLOCATE delete_loop

SELECT * FROM t_test

EDIT: Adapted to only look at col1 and col2 for uniqueness.

You're welcome. If it would be too slow you can first add an index on col1 & col2, run the fix and them remove the index again. That is, if there is no such an index yet off course. — deroby, Mar 11 '14 at 20:45

score 0 · Answer 2 · answered Mar 11 '14 at 20:52

Here is an easy method to identify duplicates, and delete them.

Add an id that increments for each combination of col1 & col2 (partition by), wrap that up in a CTE, and delete the records that are not equal to 1 (the first occurrence).

DECLARE @Test TABLE (col1 int, col2 int, col3 int, col4 int)

INSERT @Test 
          SELECT 10, 0, 1000, 1 
UNION ALL SELECT 10, 0, 1000, 1 --> should be deleted
UNION ALL SELECT 10, 0, 1111, 2 --> should be deleted
UNION ALL SELECT 10, 1, 1000, 1
UNION ALL SELECT 10, 2, 1000, 1
UNION ALL SELECT 15, 0, 1000, 1
UNION ALL SELECT 15, 0, 1000, 1 --> should be deleted
UNION ALL SELECT 16, 0, 1000, 1

;WITH DUPES
AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY COL1,COL2 ORDER BY COL1,COL4) AS myID
FROM @Test
)

DELETE D
FROM DUPES D
WHERE myID <> 1

SELECT * 
FROM @Test

Removing duplicate SQL table rows

2 Answers2