1

i have a table that somehow got duplicated. i basically want to delete all records that are duplicates, which is defined by a field in my table called SourceId. There should only be one record for each source ID.

is there any SQL that i can write that will delete every duplicate so i only have one record per Sourceid ?

leora
  • 188,729
  • 360
  • 878
  • 1,366

2 Answers2

3

Assuming you have a column ID that can tie-break the duplicate sourceid's, you can use this. Using min(id) causes it to keep just the min(id) per sourceid batch.

delete from tbl
where id NOT in
(
select  min(id)
from tbl
group by sourceid
)
RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
  • I don't think this is correct. an id could appear in the subquery if it's the min(id) from some other sourceid. I think you need a correlated subquery to be only the min(id) you are looking for. – Walter Mitty Feb 10 '11 at 15:05
  • @Walter the assumption here is that ID is unique across the table. – RichardTheKiwi Feb 10 '11 at 17:35
1
delete from table
where pk in (
select i2.pk
from table i1
  inner join table i2
   on i1.SourceId = i2.SourceId
)

good practice is to start with select * from … and only later replace to delete from …

bw_üezi
  • 4,483
  • 4
  • 23
  • 41