
I'm trying to add a compound primary key to a table I have called "Stock". Since there are duplicate values in the columns I want to use for the key, it won't allow me to add the primary key. Is there a way to delete one of each duplicate in a single query instead of having to do it manually?

I've found all the duplicates using this query (67 total rows with duplicates):

SELECT Count(*), STK, yr, mn, dy
FROM Stock
GROUP BY STK, yr, mn, dy
HAVING count(*) > 1
ORDER BY STK ASC;
il5amees
  • Possible duplicate of [How to delete duplicates on a MySQL table?](https://stackoverflow.com/questions/2630440/how-to-delete-duplicates-on-a-mysql-table) – ceejayoz Apr 16 '19 at 20:59
  • Please tag your question with the RDBMS that you're using. – Eric Brandt Apr 16 '19 at 21:07
  • Possible duplicate of [How to delete duplicate rows in sql server?](https://stackoverflow.com/questions/18390574/how-to-delete-duplicate-rows-in-sql-server) – Gen Wan Apr 16 '19 at 21:16
  • I've tried the ALTER IGNORE TABLE `table_name` ADD UNIQUE (title, SID) solution and it has not worked, since IGNORE was not recognized. Also, I'm using Microsoft SQL Server. – il5amees Apr 16 '19 at 21:20
  • It is very likely you will regret separating the components of a date into their own columns. With this approach, you will find it difficult to enforce the proper domains of each column - which is something that will automatically happen with the use of the proper (date) datatype. – SMor Apr 17 '19 at 12:36
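Following up on SMor's comment, the three columns could be consolidated into a single proper date column with something like the sketch below. This is only an illustration: `trade_date` is a hypothetical column name, and `DATEFROMPARTS` requires SQL Server 2012 or later.

```sql
-- Hypothetical migration: add a real date column and populate it from yr/mn/dy
ALTER TABLE Stock ADD trade_date date;
GO
UPDATE Stock
SET trade_date = DATEFROMPARTS(yr, mn, dy);
```

After verifying the new column, the old yr/mn/dy columns could be dropped and the key defined on (STK, trade_date) instead.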

2 Answers


You can use a CTE with ROW_NUMBER - within each group of duplicates, every row past the first gets a RowNum greater than 1. In SQL Server you can then DELETE through the CTE directly, which removes only those extra rows (try this in a test copy of the database, not production):

; WITH AllRecords
AS (
    SELECT STK, yr, mn, dy,
           RowNum = ROW_NUMBER() OVER (PARTITION BY STK, yr, mn, dy ORDER BY STK)
    FROM Stock
    )
DELETE FROM AllRecords
WHERE RowNum > 1;

Note that joining Stock back to the CTE on the key columns would match every row of a duplicated group to that group's RowNum > 1 rows, so a join-based DELETE would remove all copies instead of just the extras; deleting through the CTE avoids that.
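Once the duplicates are gone, the compound primary key from the question should go on cleanly. A possible statement (the constraint name PK_Stock is just an example):

```sql
ALTER TABLE Stock
ADD CONSTRAINT PK_Stock PRIMARY KEY (STK, yr, mn, dy);
```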
Russell Fox

I'm sure there's a more elegant solution, but for a one-off data fix like this, I typically Select unique copies of the duplicated records into a temp table, Delete them from the production table, and then Insert the temp table back into the production table. It's quick, it's dirty, and it lets me get on with my life.
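The three steps above can be sketched as follows. This is a rough sketch assuming STK, yr, mn, dy are the only columns in Stock; if the table has more columns, include them in the SELECT and INSERT lists, provided they are identical across the duplicates.

```sql
-- 1. Capture one copy of each duplicated key in a temp table
SELECT STK, yr, mn, dy
INTO #dupes
FROM Stock
GROUP BY STK, yr, mn, dy
HAVING COUNT(*) > 1;

-- 2. Delete every row belonging to a duplicated key
DELETE s
FROM Stock s
JOIN #dupes d
  ON s.STK = d.STK AND s.yr = d.yr AND s.mn = d.mn AND s.dy = d.dy;

-- 3. Re-insert a single copy of each
INSERT INTO Stock (STK, yr, mn, dy)
SELECT STK, yr, mn, dy FROM #dupes;

DROP TABLE #dupes;
```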

Brian