0

I have a couple of tables and I want to display some data from one table, and aggregate the data from another. I cannot do this, however, because including the column I want to display in the GROUP BY will actually mess up the resulting data. Is there no way to avoid this limitation? Seems kind of silly.

A few queries that I've tried:

SELECT
    Shows.id,
    Shows.title,
    CAST(AVG(Reviews.rating) AS DECIMAL(2,1)) AS 'Average Rating'
FROM Shows JOIN Reviews
    ON Shows.id = Reviews.shows_id
GROUP BY Shows.id, title;

SELECT title, CAST(AVG(Reviews.rating) AS DECIMAL(2,1)) AS 'Average Rating'
FROM Shows JOIN Reviews
    ON Shows.id = Reviews.shows_id
WHERE EXISTS (
    SELECT title, CAST(AVG(Reviews.rating) AS DECIMAL(2,1))
    FROM Shows JOIN Reviews
        ON Shows.id = Reviews.shows_id
    GROUP BY Shows.id, title
)
ORDER BY [Average Rating] DESC;

Ideally I want to be able to do this:

SELECT title, CAST(AVG(Reviews.rating) AS DECIMAL(2,1)) AS 'Average Rating'
FROM Series JOIN Reviews
    ON Series.id = Reviews.series_id
GROUP BY Series.id
ORDER BY [Average Rating] DESC;

Because let's say the data I have is:

1   Archer  2009    Animation
2   Arrested Development    2003    Comedy
3   Bob's Burgers   2011    Animation
4   Bojack Horseman 2014    Animation
5   Breaking Bad    2008    Drama
6   Curb Your Enthusiasm    2000    Comedy
7   Fargo   2014    Drama
8   Freaks and Geeks    1999    Comedy
9   General Hospital    1963    Drama
10  Halt and Catch Fire 2014    Drama
11  Malcolm In The Middle   2000    Comedy
12  Pushing Daisies 2007    Comedy
13  Seinfeld    1989    Comedy
14  Stranger Things 2016    Drama
15  Halt and Catch Fire 2015    Drama

Where "Halt and Catch Fire" has two entries, but one has a release_year of 2015 instead of 2014 -- but I want BOTH to show up in the end, with separate Average Ratings -- I apparently can't do this because to include both the title and [Average Rating], the title has to be in the GROUP BY, and if I include the title in the GROUP BY, it will lump the second "Halt and Catch Fire" in with the first one, and I won't get accurate Average Ratings.

Is there any way around this? If I try to run the "ideal" query, I get: Column 'Series.title' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Which, especially if I'm grouping by the Primary key, I think the limitation is a little silly.

GrowingCode247
  • 534
  • 1
  • 3
  • 15

2 Answers2

0

I am by no means an expert, but I think an OVER clause might do the trick. Per this answer, an over clause with a PARTITION BY could help prevent the undesired lumping that you are seeing.

Escherichia
  • 65
  • 1
  • 7
0

If you want the average rating calculated separately for the two different series, add the series and year to the group by clause.

SELECT title, releaser_year, CAST(AVG(Reviews.rating) AS DECIMAL(2,1)) AS 'Average Rating'
FROM Series JOIN Reviews
    ON Series.id = Reviews.series_id
GROUP BY Series.id, Series.Title, Series.release_year
ORDER BY [Average Rating] DESC;

Though I think you have a misconception about the grouping clause if you think that adding the title will group both records together; it won't. It would group the id and title together, so you would get two rows.

Now if you want to average both series together, but display two rows, that's a slightly different problem, and you could utilise windowing functions; though you will then need to run a distinct clause to deduplicate the result set as aggregation with windowing would return a row per review.

SELECT DISTINCT title, 
release_year, 
AVG(reviews.rating) OVER (PARTITION BY title) as [Average Rating]
FROM Series JOIN Reviews
    ON Series.id = Reviews.series_id
ORDER BY [Average Rating] DESC;
Matthew Darwin
  • 325
  • 1
  • 10