duplicates rows in database

Question

We have done a translations api were our users can add translations to the system. A bug in the api added duplicate rows which i need to remove.

The translationConnection_Id combined with TranslationCompanyKey_Id is a key and therefor should not be able to be duplicate. I since i'm a real sucker at SQL i need some help to create a script to remove all duplicates but saves one of the rows.

SELECT TOP 1000 [Id]
  ,[Value]
  ,[TranslationConnection_Id]
  ,[TranslationCompanyKey_Id]
 FROM [AAES_TRAN].[dbo].[Translations]

Possible duplicate of [How can I remove duplicate rows?](http://stackoverflow.com/questions/18932/how-can-i-remove-duplicate-rows) — Reisclef, Dec 20 '16 at 14:13
You have 2 different values in the `VALUE` field. How do you determine which to keep? — JohnHC, Dec 20 '16 at 14:13
I do only need to save one of them, doesn't matter which one :) — Daniel Gustafsson, Dec 20 '16 at 14:15

score 5 · Accepted Answer · answered Dec 20 '16 at 14:16

5

I think this will work. Try the SELECT part first to be sure.

DELETE FROM tblTranslations WHERE ID IN (
    SELECT MAX(ID) FROM tblTranslations
    GROUP BY TranslationConnection_Id, TranslationCompanyKey_Id
    HAVING COUNT(*) > 1
)

answered Dec 20 '16 at 14:16

idstam

2,848
1
21
30

Thanks! Works perfect! – Daniel Gustafsson Dec 20 '16 at 14:20

score 0 · Answer 2 · answered Dec 20 '16 at 14:22

You can use ROW_NUMBER() analytic function for this like

SELECT * FROM (
SELECT TOP 1000 [Id]
  ,[Value]
  ,[TranslationConnection_Id]
  ,[TranslationCompanyKey_Id]
  ,ROW_NUMBER() OVER(PARTITION BY [Value] ORDER BY [Value]) AS rn
 FROM [AAES_TRAN].[dbo].[Translations] ) xxx
WHERE rn = 1;

score 0 · Answer 3 · answered Dec 20 '16 at 15:12

I think you would like to keep one record from the duplicate records.

This code will delete duplicate records but will keep one records. Try this:

DELETE FROM TestTranslationTable WHERE ID IN(
     SELECT Id FROM TestTranslationTable tblMain
     WHERE tblMain.Id NOT IN (SELECT TOP 1 tblNested.id FROM TestTranslationTable tblNested WHERE 
     tblNested.TranslationConnectionId=tblMain.TranslationConnectionId ORDER BY tblNested.TranslationConnectionId)
     AND tblMain.TranslationConnectionId IN(SELECT TranslationConnectionId FROM TestTranslationTable 
     GROUP BY TranslationConnectionId HAVING COUNT(TranslationConnectionId)>1 )
     )
     GO

Ref: https://jwcooney.com/2011/11/18/sql-server-delete-duplicate-records-except-the-first-duplicate-record/

duplicates rows in database

3 Answers3