Select max value from column for every value in other two columns

Question

I'm working on a webapp that tracks tvshows, and I need to get all episodes id's that are season finales, which means, the highest episode number from all seasons, for all tvshows.

This is a simplified version of my "episodes" table.

 id   tvshow_id   season   epnum
 ---|-----------|--------|-------
 1  |     1     |    1   |   1
 2  |     1     |    1   |   2
 3  |     1     |    1   |   3
 4  |     1     |    2   |   1
 5  |     1     |    2   |   2
 6  |     2     |    1   |   1
 7  |     2     |    1   |   2
 8  |     2     |    1   |   3
 9  |     2     |    1   |   4
10  |     2     |    2   |   1
11  |     2     |    2   |   2

The expect output:

 id
 ---|
 3  |
 5  |
 9  |
11  |

I've managed to get this working for the latest season but I can't make it work for all seasons.

I've also tried to take some ideas from this but I can't seem to find a way to add the tvshow_id in there.

I'm using Postgres v10

If you say you already got parts of it working, it'd be good to see your source code. It's easier to fix a small error than writing it all anew. — Matthias Bö, Jun 29 '18 at 21:34
I don't think it will be useful because its limited to the latest season and the tvshow_id is hardcoded, so the correct answer is unlikely to use any of this.. but here you go https://pastebin.com/DdVQMTx2 — nip, Jun 29 '18 at 21:38

score 2 · Answer 1 · answered Jun 29 '18 at 21:40

2

You can use the below SQL to get your result, using GROUP BY with sub-subquery as:

select id from tab_x
where (tvshow_id,season,epnum) in (
select tvshow_id,season,max(epnum)
from tab_x
group by tvshow_id,season)

answered Jun 29 '18 at 21:40

Vivek

783
5
11

score 2 · Accepted Answer · answered Jun 29 '18 at 21:42

2

SELECT Id from
(Select *, Row_number() over (partition by tvshow_id,season order by epnum desc) as ranking from tbl)c
Where ranking=1

answered Jun 29 '18 at 21:42

Ajay Gupta

1,775
1
10
22

How would this relate to @Vivek 's answer in terms of performance ? I got the same result (both output and operation time) and the test table that im using is quite small (~150 rows), so I dont have an idea. – nip Jun 29 '18 at 21:49
1

@nip I'm not sure about PostgreS, but window functions are usually very performant in most databases I've used. Plus it keeps things set-based. Also, `WHERE x IN (1,2,3)` is pretty much `WHERE x = 1 OR x=2 OR x=3`. For large lists, that can get pretty slow. – Shawn Jun 29 '18 at 22:25
@Shawn Well, now I'm concerced because this table is going to get somewhat big and I've added another WHERE IN to filter the tvshow_id's, since I only want the season finales from a certain number of tvshows. How would this work (the filtering) with this answer without using WHERE IN ? Is this even possible ? – nip Jun 29 '18 at 23:21
@nip I'd use the `ROW_NUMBER()` method. Use Ajay Gupta's answer. http://www.postgresqltutorial.com/postgresql-row_number/ – Shawn Jul 03 '18 at 00:32

Sabari · Answer 3 · 2018-07-01T02:56:36.457

0

Below is the simple query to get desired result. Below query is also good in performance with help of using distinct on() clause

  select
   distinct on (tvshow_id,season)
   id
  from your_table
  order by tvshow_id,season ,epnum desc

edited Jul 01 '18 at 02:56

answered Jul 01 '18 at 02:51

Sabari

234
2
6

Select max value from column for every value in other two columns

3 Answers3