121

The table name is "OrderDetails" and columns are given below:

OrderDetailID || ProductID || ProductName || OrderQuantity

I'm trying to select multiple columns and Group By ProductID while having SUM of OrderQuantity.

 Select ProductID,ProductName,OrderQuantity Sum(OrderQuantity)
 from OrderDetails Group By ProductID

But of course this code gives an error. I have to add other column names to group by, but that's not what I want and since my data has many items so results are unexpected that way.

Sample Data Query:

ProductID,ProductName,OrderQuantity from OrderDetails

Results are below:

 ProductID     ProductName    OrderQuantity
    1001          abc               5
    1002          abc               23    (ProductNames can be same)
    2002          xyz               8
    3004          ytp               15
    4001          aze               19
    1001          abc               7     (2nd row of same ProductID)

Expected result:

 ProductID     ProductName    OrderQuantity
    1001          abc               12    (group by productID while summing)
    1002          abc               23
    2002          xyz               8
    3004          ytp               15
    4001          aze               19

How do I select multiple columns and Group By ProductID column since ProductName is not unique?

While doing that, also get the sum of the OrderQuantity column.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Ozan Ayten
  • 1,363
  • 2
  • 14
  • 20
  • You might want to look at aggregation of a string. I have no experience with this unfortunately. http://stackoverflow.com/questions/13639262/optimal-way-to-concatenate-aggregate-strings – Joe_DM Jan 20 '14 at 02:35
  • There are many ways to do this, but I think the easiest / best way is the using CTEs (Common Table Expressions). (Look it up on your favourite search engine) – BenKoshy Nov 02 '22 at 22:14

12 Answers12

166

I use this trick to group by one column when I have a multiple columns selection:

SELECT MAX(id) AS id,
    Nume,
    MAX(intrare) AS intrare,
    MAX(iesire) AS iesire,
    MAX(intrare-iesire) AS stoc,
    MAX(data) AS data
FROM Produse
GROUP BY Nume
ORDER BY Nume

This works.

Dan Swain
  • 2,910
  • 1
  • 16
  • 36
Urs Marian
  • 1,752
  • 1
  • 10
  • 3
  • 12
    Ingenious, thanks! For those passing by: you put `max()` around each ungrouped column, put `as ___` to rename it to what you want it to display, and then `group by` the columns that you want distinct that do not have `max()` around them. – Andrew Jun 08 '17 at 14:04
  • 4
    Haha, nice method to fool the SQL, but I was wondering if this works in all situation? – ngShravil.py Aug 25 '17 at 18:54
  • 57
    This doesn't make sense and can be wrong! If there are several column_B per column_A in your data, then if you group by Column_A and use MAX(Column_B) in your select to bypass the group-by limit, then its just one of those column_B values (here the greatest one as returened by MAX). It is not usually what you want! And if there is not different column_B values per column_A in your data, then you should simply add your column_B to GROUP BY clause as mentioned by other answers. @Andrew – S.Serpooshan Dec 26 '18 at 14:18
  • 4
    I agree with @Andrew – S.Serpooshan its don't work when we have different values in columns – Appesh May 24 '19 at 20:59
  • Yeah everyone uses MAX() but it's a good way to guarantee your intrare and iesire data are coming from different rows and no longer agree well enough to function as a key. – Jessica Pennell Oct 30 '19 at 16:47
  • 2
    This wouldn't work if your column type is a boolean – Hermawan Wiwid May 03 '20 at 04:58
  • is there a way that on the non grouped column where multiple different values on one ID are possible, to define which value will be shown? so e.g. the value of the last data sets? – DubZ Jul 20 '20 at 07:56
  • ok I found a solution. maybe also interesing for you @S.Serpooshan: I did a subquery instead of max(columnname) in the select with top 1 and order by – DubZ Jul 20 '20 at 08:31
  • @DubZ thanks, I'm busy now, may you post your solution to see it more clearly? – S.Serpooshan Jul 22 '20 at 09:40
  • @S.Serpooshan I posted an own answer to this yet – DubZ Jul 22 '20 at 12:47
  • Man, this looks dangerously bad. – TheRealChx101 Feb 02 '23 at 17:15
  • @S.Serpooshan You have a point there. But there are scenarios where we do not care about the contents of the other ungrouped columns. That was my case. I had a unique constraint on some columns, and I didn't care about the contents of the columns that weren't part of the unique constraint. So this worked like charm. – Lucio Mollinedo May 10 '23 at 16:16
23

I just wanted to add a more effective and generic way to solve this kind of problems. The main idea is about working with sub queries.

do your group by and join the same table on the ID of the table.

your case is more specific since your productId is not unique so there is 2 ways to solve this.

I will begin by the more specific solution: Since your productId is not unique we will need an extra step which is to select DISCTINCT product ids after grouping and doing the sub query like following:

WITH CTE_TEST AS (SELECT productId, SUM(OrderQuantity) Total
                    FROM OrderDetails
                    GROUP BY productId)
SELECT DISTINCT(OrderDetails.ProductID), OrderDetails.ProductName, CTE_TEST.Total
FROM OrderDetails 
INNER JOIN CTE_TEST ON CTE_TEST.ProductID = OrderDetails.ProductID

this returns exactly what is expected

 ProductID     ProductName         Total
    1001          abc               12    
    1002          abc               23
    2002          xyz               8
    3004          ytp               15
    4001          aze               19

But there a cleaner way to do this. I guess that ProductId is a foreign key to products table and i guess that there should be and OrderId primary key (unique) in this table.

in this case there are few steps to do to include extra columns while grouping on only one. It will be the same solution as following

Let's take this t_Value table for example:

enter image description here

If i want to group by description and also display all columns.

All i have to do is:

  1. create WITH CTE_Name subquery with your GroupBy column and COUNT condition
  2. select all(or whatever you want to display) from value table and the total from the CTE
  3. INNER JOIN with CTE on the ID(primary key or unique constraint) column

and that's it!

Here is the query

WITH CTE_TEST AS (SELECT Description, MAX(Id) specID, COUNT(Description) quantity 
                    FROM sch_dta.t_value
                    GROUP BY Description)
SELECT sch_dta.t_Value.*, CTE_TEST.quantity 
FROM sch_dta.t_Value 
INNER JOIN CTE_TEST ON CTE_TEST.specID = sch_dta.t_Value.Id

And here is the result:

enter image description here

F. Müller
  • 3,969
  • 8
  • 38
  • 49
Haithem KAROUI
  • 1,533
  • 4
  • 18
  • 39
12

Your Data

DECLARE @OrderDetails TABLE 
(ProductID INT,ProductName VARCHAR(10), OrderQuantity INT)

INSERT INTO @OrderDetails VALUES
(1001,'abc',5),(1002,'abc',23),(2002,'xyz',8),
(3004,'ytp',15),(4001,'aze',19),(1001,'abc',7)

Query

 Select ProductID, ProductName, Sum(OrderQuantity) AS Total
 from @OrderDetails 
 Group By ProductID, ProductName  ORDER BY ProductID

Result

╔═══════════╦═════════════╦═══════╗
║ ProductID ║ ProductName ║ Total ║
╠═══════════╬═════════════╬═══════╣
║      1001 ║ abc         ║    12 ║
║      1002 ║ abc         ║    23 ║
║      2002 ║ xyz         ║     8 ║
║      3004 ║ ytp         ║    15 ║
║      4001 ║ aze         ║    19 ║
╚═══════════╩═════════════╩═══════╝
S.Serpooshan
  • 7,608
  • 4
  • 33
  • 61
M.Ali
  • 67,945
  • 13
  • 101
  • 127
  • 13
    But I said, I don't want to add other column names to group by and it gives unexpected results. – Ozan Ayten Jan 19 '14 at 14:41
  • Well it shouldnt give you unexpected results unless you have multiple Productnames associated with same ProductID. if this is the case and you want to avoid this see my update – M.Ali Jan 19 '14 at 14:46
  • The query I used before does provide the expected result set you have shown with your sample data. – M.Ali Jan 19 '14 at 14:52
  • Yes there are same product names, I should put it before sorry. Edited query works but there is a slight problem. That rows are getting summed are duplicated – Ozan Ayten Jan 19 '14 at 14:52
  • yes it will produce duplicate results as you have duplicate data. One way around I can think of, if you have duplicate ProductNames Associated with same productID is you need to use SUM quantity GROUP BY ProductName and then join it to the the actual table on TableName but this will give you again Multiple rows if you want to see ProductID, ProductName and Quantity as you have duplicates in your data. – M.Ali Jan 19 '14 at 14:59
  • I don't have duplicate data tho, Let's say I have 1001,1002 productID's which is 'unique'. 1001 and 1002 can have same ProductName. So I need to group by only ProductID – Ozan Ayten Jan 19 '14 at 15:03
  • @OzanAyten yes then only group by the ProductID as I have done in my edited query. and dont join it with the table itself again because it will show the the data which you might consider to be duplicate but actually it not since it will be for different productIDs with same product name.] – M.Ali Jan 19 '14 at 15:08
  • 1
    @OzanAyten I have used your updated data with the same query and it is showing me the results as shown in your expected result set. – M.Ali Jan 19 '14 at 15:14
  • 3
    Yes but my question is clear as much as I can put it. There is too much data that would be not understandable if I put that on my question.So that's why I'm asking exclusively selecting multiple columns while group by only one column. – Ozan Ayten Jan 19 '14 at 15:22
  • This is actually the true answer, you can't explain whats wrong with it @OzanAyten. see my comment on accepted answer (UsrMarian) for more info about this... – S.Serpooshan Dec 26 '18 at 14:29
  • @OzanAyten even if you have same ProductName for different ProductID values (as abc in your sample), this solution will also work! Group By ProductId, ProductName will not change the result, because the second part of group by (ProductName) will only be applied within the first part (ProductID). this is shown by the code and its results shown here – S.Serpooshan Dec 26 '18 at 14:31
  • I've solved this question with very easy trick. But sadly, I don't remember what I did back then. Your solution is quite helpful too. – Ozan Ayten Dec 27 '18 at 17:32
5

mysql GROUP_CONCAT function could help https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html#function_group-concat

SELECT ProductID, GROUP_CONCAT(DISTINCT ProductName) as Names, SUM(OrderQuantity)
FROM OrderDetails GROUP BY ProductID

This would return:

ProductID     Names          OrderQuantity
1001          red            5
1002          red,black      6
1003          orange         8
1004          black,orange   15

Similar idea as the one @Urs Marian here posted https://stackoverflow.com/a/38779277/906265

Ivar
  • 4,350
  • 2
  • 27
  • 29
  • 1
    Very cool feature :) It looks like microsoft finally got something similar a while back, https://database.guide/the-sql-server-equivalent-to-group_concat/ – Jessica Pennell Oct 30 '19 at 17:24
2

You can try this:

Select ProductID,ProductName,Sum(OrderQuantity) 
 from OrderDetails Group By ProductID, ProductName

You're only required to Group By columns that doesn't come with an aggregate function in the Select clause. So you can just use Group By ProductID and ProductName in this case.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
har07
  • 88,338
  • 12
  • 84
  • 137
  • 4
    Same on this answer,I said, I don't want to add other column names to group by and it gives unexpected results. – Ozan Ayten Jan 19 '14 at 14:42
  • one productId can only have one related ProductName isn't it? So Group By ProductId, ProductName will give the same result as Group By ProductId in that case – har07 Jan 19 '14 at 14:49
  • ProductName isn't unique only ProductID is unique. Also, I know what you mean by your answer but in my question I exclusively asking for group by only one column. – Ozan Ayten Jan 19 '14 at 14:55
2

You can try the below query. I assume you have a single table for all your data.

SELECT OD.ProductID, OD.ProductName, CalQ.OrderQuantity
FROM (SELECT DISTINCT ProductID, ProductName
      FROM OrderDetails) OD
INNER JOIN (SELECT ProductID, OrderQuantity SUM(OrderQuantity)
            FROM OrderDetails
            GROUP BY ProductID) CalQ
ON CalQ.ProductID = OD.ProductID
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
A_B
  • 51
  • 1
  • 9
2
    WITH CTE_SUM AS (
      SELECT ProductID, Sum(OrderQuantity) AS TotalOrderQuantity 
      FROM OrderDetails GROUP BY ProductID
    )
    SELECT DISTINCT OrderDetails.ProductID, OrderDetails.ProductName, OrderDetails.OrderQuantity,CTE_SUM.TotalOrderQuantity 
    FROM 
    OrderDetails INNER JOIN CTE_SUM 
    ON OrderDetails.ProductID = CTE_SUM.ProductID

Please check if this works.

Vikram
  • 65
  • 2
  • 9
2

In my opinion this is a serious language flaw that puts SQL light years behind other languages. This is my incredibly hacky workaround. It is a total kludge but it always works.

Before I do I want to draw attention to @Peter Mortensen's answer, which in my opinion is the correct answer. The only reason I do the below instead is because most implementations of SQL have incredibly slow join operations and force you to break "don't repeat yourself". I need my queries to populate fast.

Also this is an old way of doing things. STRING_AGG and STRING_SPLIT are a lot cleaner. Again I do it this way because it always works.

-- remember Substring is 1 indexed, not 0 indexed
SELECT ProductId
  , SUBSTRING (
      MAX(enc.pnameANDoq), 1, CHARINDEX(';', MAX(enc.pnameANDoq)) - 1
    ) AS ProductName
  , SUM ( CAST ( SUBSTRING (
      MAX(enc.pnameAndoq), CHARINDEX(';', MAX(enc.pnameANDoq)) + 1, 9999
    ) AS INT ) ) AS OrderQuantity
FROM (
    SELECT CONCAT (ProductName, ';', CAST(OrderQuantity AS VARCHAR(10)))
      AS pnameANDoq, ProductID
    FROM OrderDetails
  ) enc
GROUP BY ProductId

Or in plain language :

  • Glue everything except one field together into a string with a delimeter you know won't be used
  • Use substring to extract the data after it's grouped

Performance wise I have always had superior performance using strings over things like, say, bigints. At least with microsoft and oracle substring is a fast operation.

This avoids the problems you run into when you use MAX() where when you use MAX() on multiple fields they no longer agree and come from different rows. In this case your data is guaranteed to be glued together exactly the way you asked it to be.

To access a 3rd or 4th field, you'll need nested substrings, "after the first semicolon look for a 2nd". This is why STRING_SPLIT is better if it is available.

Note : While outside the scope of your question this is especially useful when you are in the opposite situation and you're grouping on a combined key, but don't want every possible permutation displayed, that is you want to expose 'foo' and 'bar' as a combined key but want to group by 'foo'

Jessica Pennell
  • 578
  • 4
  • 14
1

==EDIT==

I checked your question again and have concluded this can't be done.

ProductName is not unique, It must either be part of the Group By or excluded from your results.

For example how would SQL present these results to you if you Group By only ProductID?

ProductID | ProductName | OrderQuantity 
---------------------------------------
1234      | abc         | 1
1234      | def         | 1
1234      | ghi         | 1
1234      | jkl         | 1
Joe_DM
  • 985
  • 1
  • 5
  • 12
  • I'm using sql and first code block gives syntax errors. Also, I don't want to add other columns to Group By. – Ozan Ayten Jan 19 '14 at 14:47
  • I've attached the only way I can think to do this without grouping by two items. The issue is if you group by a number there is no way to select the corresponding string without performing some aggregation on it. The answer posted by @har07 looks like the best option. For example if two items have the same OrderQuantity but a different ProductName the Server has no idea which ProductName to show you. Hope this makes sense. – Joe_DM Jan 19 '14 at 15:02
  • I want to merge and sum OrderQuantity of rows of same ProductID :) Also I know why this won't work. It all makes sense but is it really not possible. – Ozan Ayten Jan 19 '14 at 15:11
  • I just noticed this puts you back at square one... What is wrong with the results you are getting? Maybe your query is exactly what you want and just formatted wrong? – Joe_DM Jan 19 '14 at 15:14
  • Sorry but as I stated on my question I need other columns too. I know how to group by a little. I could do this by myself but my question is different. I've edited my question, please read the last phase – Ozan Ayten Jan 19 '14 at 15:15
  • If the problem is that you want the OrderID to be displayed then this is not possible as multiple orders will have different OrderID however be grouped by the ProductID. Could you add more clarification to what you are trying to achieve which you are not getting with the query you posted? – Joe_DM Jan 19 '14 at 15:18
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/45579/discussion-between-ozan-ayten-and-joe-dm) – Ozan Ayten Jan 19 '14 at 15:24
0

I had a similar problem to the OP. Then I saw the answer from @Urs Marian which helped a lot. But additionally what I was looking for is, when there are multiple values in a column and they will be grouped, how I can get the last submitted value (e.g. ordered by a date/id column).

Example:

We have following table structure:

CREATE TABLE tablename(
    [msgid] [int] NOT NULL,
    [userid] [int] NOT NULL,
    [username] [varchar](70) NOT NULL,
    [message] [varchar](5000) NOT NULL
) 

Now there are at least two datasets in the table:

+-------+--------+----------+---------+
| msgid | userid | username | message |
+-------+--------+----------+---------+
|     1 |      1 | userA    | hello   |
|     2 |      1 | userB    | world   |
+-------+--------+----------+---------+

Therefore following SQL script does work (checked on MSSQL) to group it, also if the same userid has different username values. In the example below, the username with the highest msgid will be shown:

SELECT m.userid, 
(select top 1 username from table where userid = m.userid order by msgid desc) as username,
count(*) as messages
FROM tablename m
GROUP BY m.userid
ORDER BY count(*) DESC
DubZ
  • 580
  • 3
  • 12
-1

The elegant way of having your desired results, is by using 'over (partion by)' sql clause in the following way:

SELECT ProductID,ProductName,OrderQuantity
    ,SUM(OrderQuantity)     OVER(PARTITION BY ProductID) AS 'Total'
    --,AVG(OrderQuantity)   OVER(PARTITION BY ProductID) AS 'Avg'
    --,COUNT(OrderQuantity) OVER(PARTITION BY ProductID) AS 'Count'
    --,MIN(OrderQuantity)   OVER(PARTITION BY ProductID) AS 'Min'
    --,MAX(OrderQuantity)   OVER(PARTITION BY ProductID) AS 'Max'
FROM OrderDetails
Amirkhm
  • 948
  • 11
  • 13
-3

SELECT ProductID, ProductName, OrderQuantity, SUM(OrderQuantity) FROM OrderDetails WHERE(OrderQuantity) IN(SELECT SUM(OrderQuantity) FROM OrderDetails GROUP BY OrderDetails) GROUP BY ProductID, ProductName, OrderQuantity;

I used the above solution to solve a similar problem in Oracle12c.

mrfournier
  • 11
  • 2