0

this is an answer to the question : We need a list of customer IDs with the total amount they have ordered. Write a SQL statement to return customer ID (cust_id in the Orders table) and total_ordered using a subquery to return the total of orders for each customer. Sort the results by amount spent from greatest to the least. Hint: you’ve used the SUM() to calculate order totals previously.

SELECT prod_name,
       (SELECT Sum(quantity)
        FROM OrderItems
        WHERE Products.prod_id=OrderItems.prod_id) AS quant_sold
FROM Products

;

So there is this simple code up here, and I know that this WHERE clause is comparing two columns in two different tables. But since We are calculating the SUM of that quantity, why do need that WHERE clause exactly. I really couldn't get it. Why the product_id exactly and not any other column ( p.s: the only shared column between those two tables is prod_id column ) I am still a beginner. Thank you!

Jens
  • 67,715
  • 15
  • 98
  • 113
  • It's called a scalar subquery. First it can only return a single value. Secondly it's "correlated" on product I'd from the order query. You can think of it as running once per output row and thereby computing the total quantity sold for each product. Without filtering every row would be the same total of the entire table for all orders for and product. – shawnt00 Dec 17 '21 at 16:46
  • This sounds like some kind of classroom exercise because this problem is simply and naturally solved with a Left Outer Join between the two tables, but someone imposed an artificial stricture asking you to use a subquery and thereby creating a less efficient solution. A more efficient version would be to move a modified version of the Orders table subquery down into the From clause and join Product to it, but still less efficient than a simple join. – Chris Maurer Dec 17 '21 at 17:19
  • @ChrisMaurer Have you ever looked at the execution plan between the two? I once thought the same; but as with many cases: it depends (https://stackoverflow.com/questions/2577174/join-vs-sub-query) Note this is but one example; more can be found. – xQbert Dec 17 '21 at 17:56

2 Answers2

0

First you would want to know the sum for each product - so need to adjust the subquery similar to this:

(SELECT prod_id, Sum(quantity) qty
 FROM OrderItems
 group by prod_id   
) AS quant_sold

then once you know how much for each product, then you can link that

SELECT prod_name,
    (SELECT prod_id, Sum(quantity) qty
     FROM OrderItems
     group by prod_id   
    ) AS quant_sold
FROM Products p
WHERE p.prod_id = quant_sold.prod_id
Randy
  • 16,480
  • 1
  • 37
  • 55
0
  • Run it without the where clause and compare the results. You'll learn a lot that way. specifically focus on two different product Ids ensuring they both have order items and quantities.

You have two different tables involved. There are multiple products. You don't want the sum of all orders on each product; which is what you would get without the where clause. So the where clause correlates the two tables ensuring you only SUM the quantity of each order item for each product between the tables. Personally, I'd use a join, sum, and a group by as I find it easier to read and I'm not a fan of sub selects in the select of another query; but that's me.

SELECT prod_name,
       (SELECT Sum(quantity)
        FROM OrderItems
        WHERE Products.prod_id=OrderItems.prod_id) AS quant_sold
FROM Products

Should be the same as:

SELECT prod_name, Sum(coalesce(P.quantity,0))
FROM Products P
LEFT JOIN orderItems OI
 on P.prod_id=OI.prod_id
GROUP BY Prod_Name

'Notes

  • the above is untested.
  • a left join is needed because all products should be listed and if a product doesn't have an order, the quantity would be zero.
  • if we use an inner join, the product would be excluded.
  • We use coalesce because you'd have a "Null" quantity instead of zero for such lines without an order item.
  • as to which is "right" well it depends and varies on different cases. each has it's own merits and in different cases, one will perform better than another, and in a different case, vice-versa. See --> Join vs. sub-query

As an example:

  • Say you have Products A & B
  • "A" has Order Item Quantities of 1 & 2
  • "B" has order item Quantities of 10 & 20
  • If we don't have the where clause every result record would have qty 33
  • If we have the where product "A" would have 3
  • product "B" would have qty 30.
xQbert
  • 34,733
  • 2
  • 41
  • 62