1

This is a MySQL question. I have three tables with the following columns:

transactions (table): transact_id, customer_id, transact_amt, product_id,
products (table): product_id, product_cost, product_name, product_category
customers (table): customer_id, joined_at, last_login_at, state, name, email

I'd like a query that finds out the most popular item in every state and the state. One of the tricky parts is that some product_name have multiple product_id. Therefore I though joining the three tables that generate an output with two columns: state and product_name. Until here that worked fine doing this:

SELECT p.product_name, c.state
FROM products p
INNER JOIN transactions t
ON p.product_id = t.product_id
INNER JOIN customers c
ON c.customer_id = t.customer_id

This selects all the products, and the states from where the customer is. The problem is that I can't find the way to rank the mos popular product per state. I tried different group by, order by and using subqueries without success. I suspect I need to do subqueries, but I can't find the way to resolve it. The expected outcome should look like this:

 most_popular_product | state
 Bamboo               | WA
 Walnut               | MO

Any help will be greatly appreciated.

Thank you!

pars1221
  • 27
  • 3
  • "The problem is that I can't find the way to rank the most popular product per state." If you have MySQL 8,0+ you can use windows functions like `ROW_NUMBER()` or `RANK()` to generate a row_number within a dataset... In lower MySQL version you can simulate ROW_NUMBER with MySQL's user variables.. Search on stackoverflow iam sure this question has a duplicate somewhere. – Raymond Nijland Sep 20 '18 at 21:46
  • @RaymondNijland How are you expecting row number to be helpful for this? Maybe window functions would be useful, but I still haven't learned them because we don't have MySQL 8 where I work. – Barmar Sep 21 '18 at 14:13
  • "How are you expecting row number to be helpful for this? Maybe window functions would be useful" i geuss because ive readed " The problem is that I can't find the way to **rank** the mos popular product per state." @Barmar .. And because of missing good example data and expected result i made a geuss of a window function or simulating one. – Raymond Nijland Sep 21 '18 at 14:26

1 Answers1

1

You need a subquery that gets the count of transactions for each product in each state.

SELECT p.product_name, c.state, COUNT(*) AS count
FROM products p
INNER JOIN transactions t
ON p.product_id = t.product_id
INNER JOIN customers c
ON c.customer_id = t.customer_id
GROUP BY p.product_name, c.state

Then write another query that has this as a subquery, and gets the highest count for each state.

SELECT state, MAX(count) AS maxcount
FROM (
    SELECT p.product_name, c.state, COUNT(*) AS count
    FROM products p
    INNER JOIN transactions t
    ON p.product_id = t.product_id
    INNER JOIN customers c
    ON c.customer_id = t.customer_id
    GROUP BY p.product_name, c.state
) AS t
GROUP BY state

Finally, join them together:

SELECT t1.product_name AS most_popular_product, t1.state
FROM (
    SELECT p.product_name, c.state, COUNT(*) AS count
    FROM products p
    INNER JOIN transactions t
    ON p.product_id = t.product_id
    INNER JOIN customers c
    ON c.customer_id = t.customer_id
    GROUP BY p.product_name, c.state
) AS t1
JOIN (
    SELECT state, MAX(count) AS maxcount
    FROM (
        SELECT p.product_name, c.state, COUNT(*) AS count
        FROM products p
        INNER JOIN transactions t
        ON p.product_id = t.product_id
        INNER JOIN customers c
        ON c.customer_id = t.customer_id
        GROUP BY p.product_name, c.state
    ) AS t
    GROUP BY state
) AS t2 ON t1.state = t2.state AND t1.count = t2.maxcount

This is basically the same pattern as SQL select only rows with max value on a column, just using the first grouped query as the table you're trying to group.

Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Thank you so much @Barmar! I've been with this trying to figure it out for many hours, since two days. It helped me resolve this query, and clarifies concepts for other queries. Thank you very much! – pars1221 Sep 20 '18 at 22:30