Query to remove duplicate rows, but with distinct columns?

Question

Suppose I have the following table:

+---------+------------+----------------+------------+-----------+------------+-----------------+
| id      |   id_2     | title          | year       | overview  |   rating   |  link           |
+---------+------------+----------------+------------+-----------+------------+-----------------+
|  000001 |   23306    |  movie title 2 | 2008       | ......    |     1.0    |  ...            |
|  000002 |   23301    |  movie title 2 | 2008       | ......    |     1.0    |  ...            |
|  000003 |   24567    |  movie title 2 | 2000       | ......    |     1.0    |  ...            |
|  000004 |   20022    |  title 100     | 2006       | ......    |     1.0    |  ...            |

As you can see, the following is true:

The first three rows have the same titles.
While the first 2 have the same title, the third row has a different year

I am trying to construct a query that will get rid of rows with duplicate titles, but keep the row with the highest id_2 number. Also, I would like to keep rows with the same title, but different years.

So far I have tried this:

SELECT  DISTINCT id, id_2, title, year
FROM table
ORDER BY id_2

But that didn't work. Does anyone know by chance how I should do this query? Or if I need to do more than one to get this to work? Sorry for being complicated, but I have very little knowledge of SQL. Thanks!

Use one of the solutions in the duplicate question, with `GROUP BY title, year` — Barmar, Feb 11 '15 at 23:16

score 0 · Answer 1 · answered Feb 11 '15 at 23:17

0

You could use Group by

SELECT 
title
, year
, max(id_2) id_2
, max(id_1) id_1
from table 
group by title, year

You didnt specify a logic for id_1 so I assumed the maximum of the duplicate set.

answered Feb 11 '15 at 23:17

SoulTrain

1,904
1
12
11

Query to remove duplicate rows, but with distinct columns?

1 Answers1