7

I currently have a table called People. Within this table there are thousands of rows of data which follow the below layout:

gkey |    Name |  Date       | Person_Id
1    |    Fred |  12/05/2012 | ABC123456
2    |    John |  12/05/2012 | DEF123456
3    |    Dave |  12/05/2012 | GHI123456
4    |    Fred |  12/05/2012 | JKL123456
5    |    Leno |  12/05/2012 | ABC123456

If I execute the following:

SELECT [PERSON_ID], COUNT(*) TotalCount
FROM [Database].[dbo].[People]
GROUP BY [PERSON_ID]
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC

I get a return of:

Person_Id     | TotalCount
ABC123456     | 2

Now I would like to remove just one row of the duplicate values so when I execute the above query I return no results. Is this possible?

LaLa
  • 301
  • 2
  • 17
  • I can't see duplicate value, Would you like to remove Fred or Leno (who have the same Person_ID) ? Is there a rule for taking such decision ? – Imane Fateh Jul 04 '13 at 11:12

4 Answers4

7
WITH a as
(
SELECT row_number() over (partition by [PERSON_ID] order by name) rn
FROM [Database].[dbo].[People]
)
DELETE FROM a
WHERE rn = 2
t-clausen.dk
  • 43,517
  • 12
  • 59
  • 92
0

Try this

DELETE FROM [People]
WHERE gkey IN
(
   SELECT MIN(gkey)
   FROM [People]
    GROUP BY [PERSON_ID]
   HAVING COUNT(*) > 1
)

You can use either MIN or Max

bvr
  • 4,786
  • 1
  • 20
  • 24
0
DELETE FROM PEOPLE WHERE gkey=
(SELECT MAX(TABLE1.gkey) FROM (SELECT P.gkey, A.PERSON_ID,A.TotalCount FROM People P,(SELECT [PERSON_ID], COUNT(*) TotalCount
FROM [Database].[dbo].[People]
GROUP BY [PERSON_ID]
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC) A WHERE P.gkey=A.gkey) TABLE1 ) TABLE2
Mariappan Subramanian
  • 9,527
  • 8
  • 32
  • 33
0

Use Top keyword with delete;

DELETE TOP(1) FROM [People]
WHERE Person_Id IN
(
   SELECT MIN([PERSON_ID])
   FROM [People]
   GROUP BY [PERSON_ID]
   HAVING COUNT(*) > 1

)

The query is same as posted by Vassy with just Top(1) addded...