1

I would like to find the total purchase for each customer then return the highest value by customer category.

For now, I'm just able to have the total purchase for each customer

SELECT  c.CustomerID,
        c.CustomerName,
        cat.CustomerCategoryName,
        SUM(p.Quantity*p.UnitPrice) AS TotalAmount
FROM
    Purchases AS p
    join Customers AS c ON c.CustomerID = p.CustomerID
    join Categories AS cat ON c.CustomerCategoryID = cat.CustomerCategoryID

GROUP BY c.CustomerID, c.CustomerName,cat.CustomerCategoryName
ORDER BY TotalAmount DESC

The result set return a row for each CustomerID

CustomerID CustomerName CustomerCategoryName TotalAmount
905 Sara Huiting Supermarket 24093.60
155 Tailspin Toys Novelty Shop 23579.50
473 Hilton Hotel 23125.60
143 Jane Doe Journalist 21915.50
518 Wingtip Toys Novelty Shop 20362.40
489 Jason Black Supermarket 20226.40
... ... ... ...

I have 6 categories:

  • Hotel
  • Journalist
  • Novelty Shop
  • Supermarket
  • Computer Store
  • Gift Store

I would like the highest "TotalAmount" for each "CustomerCategoryName", so that only 6 records are returned (instead of 500).

CustomerID CustomerName CustomerCategoryName TotalAmount
905 Sara Huiting Supermarket 24093.60
155 Tailspin Toys Novelty Shop 23579.50
473 Hilton Hotel 23125.60
143 Jane Doe Journalist 21915.50
1018 Nils Kaulins Computer Store 17019.00
866 Jay Bhuiyan Gift Store 14251.50

How to improve my query to get this output?

Lilly_Co
  • 169
  • 12

2 Answers2

1

You can use TOP(1) WITH TIES in combination with an ORDER BY clause on a ROW_NUMBER window function, that will assign ranking = 1 to all the highest "TotalAmount" values for each "CustomerCategoryName".

SELECT TOP(1) WITH TIES 
        c.CustomerID,
        c.CustomerName,
        cat.CustomerCategoryName,
        SUM(p.Quantity*p.UnitPrice) AS TotalAmount
FROM Purchases  p
JOIN Customers  c   ON c.CustomerID = p.CustomerID
JOIN Categories cat ON c.CustomerCategoryID = cat.CustomerCategoryID
GROUP BY c.CustomerID, 
         c.CustomerName,
         cat.CustomerCategoryName
ORDER BY ROW_NUMBER() OVER(PARTITION BY cat.CustomerCategoryName 
                           ORDER     BY SUM(p.Quantity*p.UnitPrice) DESC)
lemon
  • 14,875
  • 6
  • 18
  • 38
  • 3
    I think you missed out the `TOP (1) WITH TIES` here. – Thom A Jan 18 '23 at 12:49
  • 1
    You're right, thanks for pointing Larnu. – lemon Jan 18 '23 at 12:51
  • Thanks for this answer but I have an error returned for the last line: Invalid column name 'TotalAmount' – Lilly_Co Jan 18 '23 at 12:56
  • 1
    Thanks for the update !! it worked I have the 6 correct results but I would like the column TotalAmount to be ordered from the highest and not randomly – Lilly_Co Jan 18 '23 at 13:12
  • 1
    If a further ordering on the sums is strictly needed, you need another subquery like `SELECT * FROM () cte ORDER BY TotalAmount`. – lemon Jan 18 '23 at 13:15
  • 1
    Totally nitpicky on my part but `TOP 1` should be `TOP (1)` per the documentation ref: https://learn.microsoft.com/en-us/sql/t-sql/queries/top-transact-sql?view=sql-server-ver16 even though it let's you do it without the parenthesis. I would also suggest `AS p` form for aliases – Mark Schultheiss Jan 18 '23 at 17:47
  • Nice catch on `TOP(1)`, the notation without parentheses looks like being deprecated from SQL Server 2005 onwards. Concerning aliasing in SQL Server, using `AS` goes up to preference as mentioned [here](https://stackoverflow.com/questions/9006477/is-it-better-to-use-column-aliases-with-or-without-the-keyword-as). – lemon Jan 18 '23 at 17:57
0

If you want to do this with just subqueries, and not with a CTE, you can do the following process:

  1. Innermost query - Get all row values
  2. Second query - Assign a row number for each row, partitioned by the CustomerCategoryName and ordered by TotalAmount
  3. Final query only has where the RowRank is 1

You can probably optimize the innermost subquery by putting the RowRank in it, but without access to the table, I'm not entirely sure if the query plan will be any more efficient.

    /*
    Get final values where the RowRank = 1
    */
    SELECT DISTINCT
    final.CustomerID,
    final.CustomerName,
    final.CustomerCategoryName,
    final.TotalAmount
    FROM (
    /*
    Get the individual row rankings by TotalAmount DESC
    */
    SELECT DISTINCT
    data.CustomerID,
    data.CustomerName,
    data.CustomerCategoryName,
    data.TotalAmount,
    ROW_NUMBER() OVER (PARTITION BY data.CustomerCategoryName ORDER BY data.TotalAmount DESC) AS RowRank
    FROM (
    /*
    Get all row values
    */
    SELECT
    c.CustomerID,
    c.CustomerName,
    cat.CustomerCategoryName,
    SUM(p.Quantity * p.UnitPrice) AS TotalAmount
    FROM Purchases AS p
    JOIN Customers AS c
    ON c.CustomerID = p.CustomerID
    JOIN Categories AS cat
    ON c.CustomerCategoryID = cat.CustomerCategoryID

    GROUP BY c.CustomerID,
    c.CustomerName,
    cat.CustomerCategoryName) data) final
    WHERE final.RowRank = 1
    ORDER BY final.TotalAmount DESC
swolfe2
  • 429
  • 1
  • 6
  • 24