0

Suppose I have the following table:

+---------+------------+----------------+------------+-----------+------------+-----------------+
| id      |   id_2     | title          | year       | overview  |   rating   |  link           |
+---------+------------+----------------+------------+-----------+------------+-----------------+
|  000001 |   23306    |  movie title 2 | 2008       | ......    |     1.0    |  ...            |
|  000002 |   23301    |  movie title 2 | 2008       | ......    |     1.0    |  ...            |
|  000003 |   24567    |  movie title 2 | 2000       | ......    |     1.0    |  ...            |
|  000004 |   20022    |  title 100     | 2006       | ......    |     1.0    |  ...            |

As you can see, the following is true:

  • The first three rows have the same titles.
  • While the first 2 have the same title, the third row has a different year

I am trying to construct a query that will get rid of rows with duplicate titles, but keep the row with the highest id_2 number. Also, I would like to keep rows with the same title, but different years.

So far I have tried this:

SELECT  DISTINCT id, id_2, title, year
FROM table
ORDER BY id_2

But that didn't work. Does anyone know by chance how I should do this query? Or if I need to do more than one to get this to work? Sorry for being complicated, but I have very little knowledge of SQL. Thanks!

Andrew Butler
  • 1,060
  • 2
  • 14
  • 31

1 Answers1

0

You could use Group by

SELECT 
title
, year
, max(id_2) id_2
, max(id_1) id_1
from table 
group by title, year

You didnt specify a logic for id_1 so I assumed the maximum of the duplicate set.

SoulTrain
  • 1,904
  • 1
  • 12
  • 11