0

I have a table 'order' containing summary order information and a table 'orderItem' with each item of the order.

My issue is: when selecting the Sum of column 'orderQTY' from the 'Order' table, I get an incorrect total if i join the orderItem table.

The first query below gives me the correct total. However, as soon as I add the join to orderItem, the sum result is incorrectly duplicating the 'orderqty' column for each orderitem record.

nb: I know the below doesn't utilise the join and isn't necessary. I've removed the clauses referring to the joined table to simplify the question.

--RETURNS Correct value
select sum(o.orderqty) 
from [order] o 

--RETURNS the sum containing duplicates of o.orderqty
select sum(o.orderqty)
from [order] o  
join OrderItem oi on o.Id = oi.OrderId

-- adding clarification: ----
I am wanting to Sum the column 'orderqty' from table 'order' while joining to orderItem eg:

There would be multiple orderItems for each Order, but I only want to obviously only count the orderqty from the order table once per order.

select sum(o.ordertotal)
from [order] o with(NOLOCK)
join OrderItem oi on o.Id = oi.OrderId
where oi.mycolumn = 1

or would i need to do something like:

select sum(o.ordertotal)
from [order] o with(NOLOCK)
where o.id in (select orderid from orderitem where x = y)
art
  • 13
  • 1
  • 3
  • 2
    yes, it is expected behavior, use 1st query – Iłya Bursov Jan 24 '15 at 00:14
  • 1
    what are you trying to do? get information from the order line and orders tables in one query? – Beth Jan 24 '15 at 00:17
  • What are using the orderqty for in the query? What rows are you asking for by the query using it? Can you just use "o.orderqty"? If you don't know how to write a query for what you want then you have to describe what you want in English and/or give examples. – philipxy Jan 24 '15 at 04:20
  • Example: OrderTotal is a column in the order table that contains the total of an order. I want to query to sum of that column in all orders that contain a certain orderitem. However order has a 1 to many to relation to orderitem. So I understand that I am getting the orderTotal in the order table for each order item. I am asking how to avoid this. – art Jan 24 '15 at 04:31
  • *Please* give an example with input and output tables and put it all in your question with one explanation. PS: Are orderqty, ordertotal and OrderTotal all the same? – philipxy Jan 24 '15 at 04:35

4 Answers4

4

It returns different results because the join multiplies the number of rows or filters out rows, both of which affect the sum. It is unclear what you really want to do. If you only want the sum of the quantities of orders that have orderlines, then use exists:

select sum(o.orderqty)
from [order] o  
where exists (select 1
              from OrderItem oi 
              where o.Id = oi.OrderId
             );
Bohemian
  • 412,405
  • 93
  • 575
  • 722
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • @RonSmith . . . If someone wants to affect my commenting behavior, they would do much better to say what the issue is than to anonymously downvote. – Gordon Linoff Jan 24 '15 at 00:40
  • Thanks for the help, i've clarified the question (I hope) above. I understand that the order record is being returned for each order, and each orderitem (of which there will be many. I don't understand how to get around the issue without removing the join? – art Jan 24 '15 at 04:05
  • have adapted your response which is now returning the correct value. which was very helpful given my obviously poor description of the problem. – art Jan 24 '15 at 04:36
1

You can sum once per grouping (in this case order.id) using the row_number function:

select sum(orderqty)
from (
    select
        case row_number() over(partition by o.Id order by o.id)
            when 1 then orderqty
            else 0
        end as orderqty
    from [order] o  
        join OrderItem oi on o.Id = oi.OrderId
    ) o
Ron Smith
  • 3,241
  • 1
  • 13
  • 16
0

Joins work to create a result set that combines data from multiple tables. SQL then operates aggregate functions and the where clause over this combined result set. So imagine you have 2 orders, 1 with a single item and 1 with 2 items:

// Order table
OrderId   OrderQty
1         5
2         3

// OrderItem table
OrderId   ItemId
1         1
2         1
2         2

Your join will give you this result:

OrderId   OrderQty   OrderId   ItemId
1         5          1         1
2         3          2         1
2         3          2         2

So when you SUM(OrderQty), you get 11 instead of 8 which I think you want. Linking to the other table doesn't make sense in this case...

Jason Goemaat
  • 28,692
  • 15
  • 86
  • 113
0

A FROM clause (acts as if it) calculates an intermediate table.

The SELECT clause expression

sum(o.orderqty)

does not mean "the sum of the orderqty column of table o". It means "the sum of the o.orderqty column of the intermediate table generated by the FROM clause".

Here are example tables:

// [order]
Id  orderqty
 1   33
 2   66

// OrderItem
OrderId  ItemId
 2        7
 1        8
 1        9

Here is the second query:

select sum(o.orderqty)
from [order] o  
join OrderItem oi on o.Id = oi.OrderId

First a FROM calculates a "cross product" that has every possible combination of a row from the tables you JOIN. So each row of it has a subrow from each table. Each subrow has column names like its source table column names but with its source table name or alias prefixed with a dot. (If there's no ambiguity you don't need to mention the prefix and dot to refer to a FROM clause table column.)

// [order] o join OrderItem oi
o.Id  o.orderqty  oi.OrderId  oi.ItemId
  1     33           2           7
  1     33           1           8
  1     33           1           9
  2     66           2           7
  2     66           1           8
  2     66           1           9

Then the rows that fail the ONs and WHERE are removed:

// [order] o join OrderItem oi on o.Id = oi.OrderId
o.Id  o.orderqty  oi.OrderId  oi.ItemId
  1     33           1           8
  1     33           1           9
  2     66           2           7

That's the second query's FROM intermediate table that its SELECT uses. Here's the first query's FROM intermediate table that its SELECT uses:

// [order] o
o.Id  o.orderqty
  1     33
  2     66

So the second query's SUM(o.orderqty) is 33+33+66 while the first query's is 33+66. Since you want SUM for a FROM clause table that is [order], you want the first query.

(We can also describe this on a join-by-join basis, which allows us to explain how LEFT JOIN differs from INNER JOIN, CROSS JOIN and "," and how USING and NATURAL make a difference.)

Re update: You have not explained what rows you want back from your query. Maybe you want:

select oi.Id, oi.orderId, o.orderqty, ...
from [order] o  
join OrderItem oi on o.Id = oi.OrderId

or

select ..., (select sum(orderqty) from [order]) as totalqty, ...
Community
  • 1
  • 1
philipxy
  • 14,867
  • 6
  • 39
  • 83