0

I need to find the most popular name per year from the below data based on the combined total count for a name each year. Note there can be multiple entries per year (as seen below).

ID  person_name total_count person_year
1   MIKE    1   2006
2   MIKE    2   2007
3   MIKE    4   2007
4   MIKE    3   2008
5   TED     1   2006
6   TED     2   2007
7   TED     4   2008
8   TED     7   2008
9   MOOKIE  1   2006
10  MOOKIE  12  2006
11  MOOKIE  5   2007
12  MOOKIE  3   2008

The SQL I need to write would produce the below result:

person_name   max_value person_year
MOOKIE        13        2006
MIKE          6         2007
TED           11        2008

Creating the SUM table is easy:

SELECT id, person_name,SUM(total_count) AS sum_count, person_year FROM temp_table GROUP BY person_name, person_year;

This gives me the Sum count per year for each name.

The problem is any MAX logic I write doesn't carry the associated NAME with the selected MAX when I group by YEAR. I've tried numerous variations and none of them work. I would have thought the below would work, but the NAME is mismatched:

    SELECT id, person_name, MAX(sum_count) AS max_count, person_year FROM 
        (SELECT id, person_name, SUM(total_count) AS sum_count, person_year FROM temp_table GROUP BY person_name, person_year) AS PC
        GROUP BY person_year;

It returns:

1   MIKE    13  2006
2   MIKE    6   2007
4   MIKE    11  2008

So I don't know how to map the selected MAX grouped by YEAR to the proper name... That's the only piece I'm missing.

Any help on this would be appreciated.

Floobinator
  • 388
  • 2
  • 11
  • 1
    Write a subquery that gets the total for each name in each year. Then use that subquery with one of the solutions in [Get rows with max value of a column](https://stackoverflow.com/questions/7745609/sql-select-only-rows-with-max-value-on-a-column?rq=1) – Barmar Sep 02 '20 at 20:51
  • 1
    If you're using MySQL 8.x, you can use a CTE for the subquery. In 5.x define a view. – Barmar Sep 02 '20 at 20:52
  • Show us the parts of the problem that you *can* solve, and see https://meta.stackoverflow.com/questions/333952/why-should-i-provide-a-minimal-reproducible-example-for-a-very-simple-sql-query. – Strawberry Sep 02 '20 at 20:55
  • @Barmar I already did that: SELECT id, person_name,SUM(total_count) AS sum_count, person_year FROM temp_table GROUP BY person_name, person_year; But I cannot figure out how to do the aggregation of selected name with max value per year. And I've looked at every example provided herein and still can't figure it out. – Floobinator Sep 02 '20 at 21:04
  • 1
    See [this answer](https://stackoverflow.com/a/1313140/1491895). Define a view or CTE named `messages` that contains your query. – Barmar Sep 02 '20 at 21:08
  • @Barmar I already reviewed that and the problem is I don't know what to inner join on - I've looked at a dozen inner join examples that are "similar" to this, but haven't been able to get any of them to work properly. Note I updated the question and provided more information above. – Floobinator Sep 02 '20 at 21:10

1 Answers1

1

First write a query to get the total for each name in each year:

SELECT person_name, person_year, SUM(total_count) AS count
FROM temp_table
GROUP BY person_name, person_year

Then use that as a CTE in a query to find the row with the max value for each year:

WITH counts AS (
    SELECT person_name, person_year, SUM(total_count) AS count
    FROM temp_table
    GROUP BY person_name, person_year)
SELECT c1.*
FROM counts AS c1
JOIN (
    SELECT person_year, MAX(count) AS max_count
    FROM counts
    GROUP BY person_year) AS c2
ON c1.person_year = c2.person_year AND c1.count = c2.max_count

DEMO

This follows the same pattern as this answer, it simply uses a CTE instead of a real table.

If you're using MySQL 5.x, define a view instead of using the CTE.

Without a CTE or view, you have to substitute the entire subquery everywhere that counts appears above.

SELECT c1.*
FROM (
    SELECT person_name, person_year, SUM(total_count) AS count
    FROM temp_table
    GROUP BY person_name, person_year) AS c1
JOIN (
    SELECT person_year, MAX(count) AS max_count
    FROM (
        SELECT person_name, person_year, SUM(total_count) AS count
        FROM temp_table
        GROUP BY person_name, person_year) AS x
    GROUP BY person_year) AS c2
ON c1.person_year = c2.person_year AND c1.count = c2.max_count
Barmar
  • 741,623
  • 53
  • 500
  • 612
  • First, you are awesome. Thank you very much. I looked at that thread and it was overwhelming (trying to pick out the details). You simplified it. So without CTE what I'm trying to do would require a temporary table or view, correct? Was there a way to do this with a single query without CTE support? (just curious) Thank you for steering me in the right direction on this... – Floobinator Sep 02 '20 at 21:43
  • 1
    Just replace every instance of `counts` with the entire subquery. – Barmar Sep 02 '20 at 21:48
  • 1
    If you want to use a temporary table, you have to create two of them, because MySQL doesn't allow you to use the same temp table multiple times in a query. – Barmar Sep 02 '20 at 21:50
  • Understood. Thank you. I've been doing database architecture for so long now, I haven't gotten into the weeds with aggregated reporting sub query solutions in over ten years. Quite rusty with it. Thanks again for pointing me in the right direction and helping me solve this issue! – Floobinator Sep 02 '20 at 22:00