Group by Category in MYSQL

Question

I'm trying to work out how to get only one result per Category from a mySQL query.

We have lots of products across several categories and the aim is to get the item with the biggest product saving per category but also excluding a few categories. This should then produce a simple Top 10 savings list. The SQL for the query I have so far is:

SELECT 
    productName,
    productCategoryID,
   ((maxprice/minprice-1)*100) as PercentChange 
FROM products WHERE updatedDate > DATE_ADD(NOW(), INTERVAL -1 DAY)
    AND productCategoryID NOT IN (0,58,12,13) 
    AND (maxprice/minprice-1)*100) < '60' 
ORDER BY PercentChange DESC LIMIT 10

The query gives me a list of Top 10 products without certain categories but several of the included categories are duplicated.

Any pointers ?

Thanks,

Rick

Possible duplicate of [Select first row in each GROUP BY group?](http://stackoverflow.com/questions/3800551/select-first-row-in-each-group-by-group) — Johan, Feb 27 '17 at 15:56

score 0 · Answer 1 · answered Feb 27 '17 at 15:45

0

You could use distinct for avoid duplication

  SELECT DISTINCT  
       productName
      ,productCategoryID
      ,((maxprice/minprice-1)*100) as PercentChange 
  FROM products 
  WHERE updatedDate > DATE_ADD(NOW(), INTERVAL -1 DAY)
      AND productCategoryID NOT IN (0,58,12,13) 
      AND (maxprice/minprice-1)*100)<'60' 
  ORDER BY PercentChange DESC LIMIT 10

answered Feb 27 '17 at 15:45

ScaisEdge

131,976
10
91
107

I think the `productName` is distinct in the query already. OP seems to want a "maximum of 1 product per category". @Rick Evans ? – dognose Feb 27 '17 at 15:49
having not a group by could that some duplicate can exist but your suggestion it's interesting ., let's see if OP gives us a clear example – ScaisEdge Feb 27 '17 at 15:52
actually yes the distinct is for the rows ..but if you could explain better adding a sample i can try to improve my answer – ScaisEdge Feb 27 '17 at 15:53
Just tried that, it does duplicate the categories. Is that selecting a DISTINCT productName ? – Rick Evans Feb 27 '17 at 15:53
@RickEvans No, `Distinct` works on the whole "row". With this query, you will get unique combinations of `productName, category, percentchanged`. It would be possible to have two products from the same category. It only excludes rows that are literaly "duplicates" of existing rows with regards to the queried columns. `SELECT DISTINCT firstname, lastname` will return `John Smith` and `John AnotherSmith`. `SELECT DISTINCT firstname` will only return `John`. – dognose Feb 27 '17 at 16:02
@dognose Ah OK, so not applicable in this case as the rows are very nearly unique across the whole DB. I think the easier option may be to do it programmatically, I was hoping it could be done in a neater way :) – Rick Evans Feb 27 '17 at 16:17

dognose · Accepted Answer · 2017-02-27T16:42:17.317

Assuming you want the "top" Product per category, you would need to group on category and use an aggregation (Max in this case).

Note, that even if mysql allows to have "unaggregated" columns along with an aggregation like bellow, the result might not be unique (MSSQL for instance does not allow such a query due to the missing aggregation/grouping on productName):

SELECT 
    productName, -- unaggregated -> undetermined.
    productCategoryID, -- grouping-condition -> fine
    MAX((maxprice/minprice-1)*100) as PercentChange -- Aggregation -> fine
FROM products WHERE updatedDate > DATE_ADD(NOW(), INTERVAL -1 DAY)
    AND productCategoryID NOT IN (0,58,12,13) 
    AND (maxprice/minprice-1)*100) < '60' 
GROUP BY productCategoryID
ORDER BY PercentChange DESC LIMIT 10

Given the data is (simplified)

productName | productCategoryID | PercentChange
A             1                   50
B             1                   10
C             2                   40 
D             2                   40

only the productCategoryID and PercentChange are reliable.

The result might be

A | 1 | 50
C | 2 | 40

but due to the lacking aggregation on productName the following result might also be possible:

A | 1 | 50
D | 2 | 40

It WILL producte a single entry per category, but if two rows are equal with regards to their "grouping", the final result is not 100% predictable and therefore another approach should be taken.

You could - for example - group by name as well, and then programmatically filter out unwanted results:

SELECT 
    productName, -- grouping-condition -> fine
    productCategoryID, -- grouping-condition -> fine
    MAX((maxprice/minprice-1)*100) as PercentChange -- Aggregation -> fine
FROM products WHERE updatedDate > DATE_ADD(NOW(), INTERVAL -1 DAY)
    AND productCategoryID NOT IN (0,58,12,13) 
    AND (maxprice/minprice-1)*100) < '60' 
GROUP BY productName, productCategoryID
ORDER BY PercentChange DESC LIMIT 10

would result in

A | 1 | 50
C | 2 | 40
D | 2 | 40

Compared to a "non-grouped" query this would at least eliminate every entry that doesn't match MAX(PercentChange) (per name and category).

ps.: If you want the minimum price change, use MIN() obviously.

Group by Category in MYSQL

2 Answers2