0

I'm working on a webapp that tracks tvshows, and I need to get all episodes id's that are season finales, which means, the highest episode number from all seasons, for all tvshows.

This is a simplified version of my "episodes" table.

 id   tvshow_id   season   epnum
 ---|-----------|--------|-------
 1  |     1     |    1   |   1
 2  |     1     |    1   |   2
 3  |     1     |    1   |   3
 4  |     1     |    2   |   1
 5  |     1     |    2   |   2
 6  |     2     |    1   |   1
 7  |     2     |    1   |   2
 8  |     2     |    1   |   3
 9  |     2     |    1   |   4
10  |     2     |    2   |   1
11  |     2     |    2   |   2

The expect output:

 id
 ---|
 3  |
 5  |
 9  |
11  |

I've managed to get this working for the latest season but I can't make it work for all seasons.

I've also tried to take some ideas from this but I can't seem to find a way to add the tvshow_id in there.

I'm using Postgres v10

FrankPl
  • 573
  • 4
  • 19
nip
  • 1,609
  • 10
  • 20
  • in which database? – Vivek Jun 29 '18 at 21:32
  • It's in the end of the question: "I'm using Postgres v10" – nip Jun 29 '18 at 21:33
  • 1
    If you say you already got parts of it working, it'd be good to see your source code. It's easier to fix a small error than writing it all anew. – Matthias Bö Jun 29 '18 at 21:34
  • I don't think it will be useful because its limited to the latest season and the tvshow_id is hardcoded, so the correct answer is unlikely to use any of this.. but here you go https://pastebin.com/DdVQMTx2 – nip Jun 29 '18 at 21:38

3 Answers3

2

You can use the below SQL to get your result, using GROUP BY with sub-subquery as:

select id from tab_x
where (tvshow_id,season,epnum) in (
select tvshow_id,season,max(epnum)
from tab_x
group by tvshow_id,season)
Vivek
  • 783
  • 5
  • 11
2
SELECT Id from
(Select *, Row_number() over (partition by tvshow_id,season order by epnum desc) as ranking from tbl)c
Where ranking=1
Ajay Gupta
  • 1,775
  • 1
  • 10
  • 22
  • How would this relate to @Vivek 's answer in terms of performance ? I got the same result (both output and operation time) and the test table that im using is quite small (~150 rows), so I dont have an idea. – nip Jun 29 '18 at 21:49
  • 1
    @nip I'm not sure about PostgreS, but window functions are usually very performant in most databases I've used. Plus it keeps things set-based. Also, `WHERE x IN (1,2,3)` is pretty much `WHERE x = 1 OR x=2 OR x=3`. For large lists, that can get pretty slow. – Shawn Jun 29 '18 at 22:25
  • @Shawn Well, now I'm concerced because this table is going to get somewhat big and I've added another WHERE IN to filter the tvshow_id's, since I only want the season finales from a certain number of tvshows. How would this work (the filtering) with this answer without using WHERE IN ? Is this even possible ? – nip Jun 29 '18 at 23:21
  • @nip I'd use the `ROW_NUMBER()` method. Use Ajay Gupta's answer. http://www.postgresqltutorial.com/postgresql-row_number/ – Shawn Jul 03 '18 at 00:32
0

Below is the simple query to get desired result. Below query is also good in performance with help of using distinct on() clause

  select
   distinct on (tvshow_id,season)
   id
  from your_table
  order by tvshow_id,season ,epnum desc
Sabari
  • 234
  • 2
  • 6