Remove 1 instance of duplicate values T-SQL

Question

I currently have a table called People. Within this table there are thousands of rows of data which follow the below layout:

gkey |    Name |  Date       | Person_Id
1    |    Fred |  12/05/2012 | ABC123456
2    |    John |  12/05/2012 | DEF123456
3    |    Dave |  12/05/2012 | GHI123456
4    |    Fred |  12/05/2012 | JKL123456
5    |    Leno |  12/05/2012 | ABC123456

If I execute the following:

SELECT [PERSON_ID], COUNT(*) TotalCount
FROM [Database].[dbo].[People]
GROUP BY [PERSON_ID]
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC

I get a return of:

Person_Id     | TotalCount
ABC123456     | 2

Now I would like to remove just one row of the duplicate values so when I execute the above query I return no results. Is this possible?

I can't see duplicate value, Would you like to remove Fred or Leno (who have the same Person_ID) ? Is there a rule for taking such decision ? — Imane Fateh, Jul 04 '13 at 11:12

score 7 · Accepted Answer · answered Jul 04 '13 at 11:17

7

WITH a as
(
SELECT row_number() over (partition by [PERSON_ID] order by name) rn
FROM [Database].[dbo].[People]
)
DELETE FROM a
WHERE rn = 2

answered Jul 04 '13 at 11:17

t-clausen.dk

43,517
12
59
92

score 0 · Answer 2 · answered Jul 04 '13 at 11:11

0

Try this

DELETE FROM [People]
WHERE gkey IN
(
   SELECT MIN(gkey)
   FROM [People]
    GROUP BY [PERSON_ID]
   HAVING COUNT(*) > 1
)

You can use either MIN or Max

answered Jul 04 '13 at 11:11

bvr

4,786
1
20
24

score 0 · Answer 3 · answered Jul 04 '13 at 11:20

DELETE FROM PEOPLE WHERE gkey=
(SELECT MAX(TABLE1.gkey) FROM (SELECT P.gkey, A.PERSON_ID,A.TotalCount FROM People P,(SELECT [PERSON_ID], COUNT(*) TotalCount
FROM [Database].[dbo].[People]
GROUP BY [PERSON_ID]
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC) A WHERE P.gkey=A.gkey) TABLE1 ) TABLE2

score 0 · Answer 4 · answered Jul 04 '13 at 11:22

0

Use Top keyword with delete;

DELETE TOP(1) FROM [People]
WHERE Person_Id IN
(
   SELECT MIN([PERSON_ID])
   FROM [People]
   GROUP BY [PERSON_ID]
   HAVING COUNT(*) > 1

)

The query is same as posted by Vassy with just Top(1) addded...

answered Jul 04 '13 at 11:22

Yudhbir Sajwan

31
1

Remove 1 instance of duplicate values T-SQL

4 Answers4