0

I have a table news with 48 columns

The table has some values like so:

ID|title            |date      |.......
 1|Apple iphone 6   |2014-08-23
 2|Samsung Galaxy s5|2014-08-23
 3|LG G3            |2014-08-25
 4|Apple iphone 6   |2014-08-25
 5|HTC One m8       |2014-08-27

The "title" value is duplicated in id 1 and 4 (Apple iphone 6)

I want to keep the just last ID row in news table and delete the other older rows. So in the above example I want to delete row with the ID 1, keep the last id (4) which has the same title column value.

Unihedron
  • 10,902
  • 13
  • 62
  • 72
Ozy
  • 9
  • 2

4 Answers4

4
delete from news
where id not in
(
  select * from 
  (
     select max(id) 
     from news
     group by title
  ) tmp
)

This query groupy by the title and selects the latest id for every unique title. Then it deletes all records that have NOT that id.

I added another subquery because in MySQL you can't select from a table you are deleting at the same time.

juergen d
  • 201,996
  • 37
  • 293
  • 362
2

Tricky as it is easy to remove non duplicates

1 find the duplicates 2 determine the oldest id's 3 remove only those!

delete from news where id in (
   select min(id) from news 
   group by title
   having title in (
      select title from news group by title having count(title) > 1
   )
)
lboshuizen
  • 2,746
  • 17
  • 20
1

this is the most explicit way to do it:

DELETE n FROM news n INNER JOIN  news nn ON  n.title = nn.title AND n.id < nn.id;

http://sqlfiddle.com/#!2/02255f/2

Bulat
  • 6,869
  • 1
  • 29
  • 52
  • I run this query but it still runs for about 10min, didnt completed and i killed the query. I have about 250K records in news table so is it very slow query right? And no usefull for me – Ozy Aug 27 '14 at 14:55
  • What about the answer by @juergen? – Bulat Aug 27 '14 at 14:57
  • @juergen 's query deleted all duplicated rows. Oldest and newest ids too.. – Ozy Aug 31 '14 at 10:31
  • http://sqlfiddle.com/#!2/a5dc7/1 not according to this – Bulat Aug 31 '14 at 12:00
0

In MySQL, I would do this using a join:

delete n
   from news n join
        (select title, max(id) as keepid
         from news
         group by title
        ) ti
        on ti.title = n.title and ti.id < keepid;

MySQL is finicky in delete and update statements about referring to the table being modified. Unfortunately, the typical way to do this in ANSI SQL (and other databases) doesn't quite work:

delete from news
    where id < (select max(id) from news n2 where n2.title = n.title);

You can get around this with a MySQL hack of using an extra layer of subqueries.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786