0

I'm trying to use union and group by with aggregate functions, but all attempts either result in missing rows or incorrect data. Below are two queries that produce the data I need, now i just need to unite them

Here is a sample SQL Dump on pastebin

SQL Query #1:

SELECT o.id as order_id, SUM(price * quantity) as total_prod, 0 as total_inst
FROM orders as o, orders_product_line as opl, products as p
WHERE opl.order_id = o.id
AND p.id = opl.product_id
GROUP BY o.id

Result for Query #1

+----------+------------+------------+
| order_id | total_prod | total_inst |
+----------+------------+------------+
|        1 |       4200 |          0 |
|        2 |      40000 |          0 |
|        3 |    3600000 |          0 |
|        4 |      44500 |          0 |
|        5 |    1229800 |          0 |
|        6 |   45000000 |          0 |
+----------+------------+------------+

SQL Query #2:

SELECT o.id as order_id, 0 as total_prod, SUM(rate * hours) as total_inst
FROM orders as o, orders_installation_line as oil, installations as i
WHERE oil.order_id = o.id
AND i.id = oil.intallation_id
GROUP BY order_id

Result for Query #2:

+----------+------------+------------+
| order_id | total_prod | total_inst |
+----------+------------+------------+
|        1 |          0 |       4675 |
|        2 |          0 |     255000 |
|        3 |          0 |      18880 |
|        4 |          0 |        600 |
|        5 |          0 |       3540 |
+----------+------------+------------+

Here is my attempt at using union for the two tables

SELECT o.id as order_id, SUM(price * quantity) as total_prod, 0 as total_inst FROM orders as o, orders_product_line as opl, products as p WHERE opl.order_id = o.id AND p.id = opl.product_id GROUP BY o.id
UNION ALL
SELECT o.id as order_id, 0 as total_prod, SUM(rate * hours) as total_inst FROM orders as o,  orders_installation_line as oil, installations as i WHERE oil.order_id = o.id AND i.id = oil.intallation_id GROUP BY order_id

And the result

+----------+------------+------------+
| order_id | total_prod | total_inst |
+----------+------------+------------+
|        1 |       4200 |          0 |
|        2 |      40000 |          0 |
|        3 |    3600000 |          0 |
|        4 |      44500 |          0 |
|        5 |    1229800 |          0 |
|        6 |   45000000 |          0 |
|        1 |          0 |       4675 |
|        2 |          0 |     255000 |
|        3 |          0 |      18880 |
|        4 |          0 |        600 |
|        5 |          0 |       3540 |
+----------+------------+------------+

Lastly, Here is my attempt at using union after reading other answers on stack overflow:

SELECT *
FROM 
(
SELECT o.id as order_id, SUM(price * quantity) as total_prod, 0 as total_inst
FROM orders as o, orders_product_line as opl, products as p
WHERE opl.order_id = o.id
AND p.id = opl.product_id
GROUP BY o.id

UNION ALL

SELECT o.id as order_id, 0 as total_prod, SUM(rate * hours) as total_inst
FROM orders as o, orders_installation_line as oil, installations as i
WHERE oil.order_id = o.id
AND i.id = oil.intallation_id
GROUP BY order_id
) Q
GROUP BY Q.order_id

And the result of this last union:

+----------+------------+------------+
| order_id | total_prod | total_inst |
+----------+------------+------------+
|        1 |       4200 |          0 |
|        2 |      40000 |          0 |
|        3 |    3600000 |          0 |
|        4 |      44500 |          0 |
|        5 |    1229800 |          0 |
|        6 |   45000000 |          0 |
+----------+------------+------------+

What am I missing? I need that last column total_inst to show values. This is the result I am looking for:

+----------+------------+------------+
| order_id | total_prod | total_inst |
+----------+------------+------------+
|        1 |       4200 |       4675 |
|        2 |      40000 |     255000 |
|        3 |    3600000 |      18880 |
|        4 |      44500 |        600 |
|        5 |    1229800 |       3540 |
|        6 |   45000000 |          0 |
+----------+------------+------------+
cborgia
  • 1,399
  • 1
  • 10
  • 10
  • Can you show what would be your desired result? On the first query you set the `total_inst` as 0. Edit your post with your tables structure and, the attempts you made plus your desired result. It will be a lot easier to help you. – Jorge Campos Dec 10 '13 at 02:26
  • Edited to show the Results I'm after. @JorgeCampos, `total_inst` (total installation revenue) is 0 in sql 1 and `total_prod` (total product revenue) is 0 in sql 2 because orders can have products and/or installations, and I want the results to be able to display 0 for `total_inst` and 0 `total_prod` as needed. – cborgia Dec 10 '13 at 02:54

2 Answers2

0

Your Group By's are right… but I don't think you want a union. Try this instead:

select q1.order_id, q1.total_prod, q2.total_inst from 
(SELECT o.id as order_id, SUM(price * quantity) as total_prod, 0 as total_inst FROM orders as o, orders_product_line as opl, products as p WHERE opl.order_id = o.id AND p.id = opl.product_id GROUP BY o.id) q1, 
(SELECT o.id as order_id, 0 as total_prod, SUM(rate * hours) as total_inst FROM orders as o,  orders_installation_line as oil, installations as i WHERE oil.order_id = o.id AND i.id = oil.intallation_id GROUP BY order_id) q2
where q1.order_id = q2.order_id

EDIT: The above query will only retrieve order_ids which are returned by BOTH inner queries. To include order_id 6, you would need a LEFT JOIN. However, that would not get records that might exist in the results from q2 but not in the results from q1 (the opposite case). A FULL JOIN would do this, but my understanding is that these are not implemented in MySQL. They can be emulated though, as explained here: Full Outer Join in MySQL

select q1.order_id, q1.total_prod, q2.total_inst from 
(SELECT o.id as order_id, SUM(price * quantity) as total_prod, 0 as total_inst FROM orders as o, orders_product_line as opl, products as p WHERE opl.order_id = o.id AND p.id = opl.product_id GROUP BY o.id) q1
LEFT JOIN
(SELECT o.id as order_id, 0 as total_prod, SUM(rate * hours) as total_inst FROM orders as o,  orders_installation_line as oil, installations as i WHERE oil.order_id = o.id AND i.id = oil.intallation_id GROUP BY order_id) q2
ON q1.order_id = q2.order_id
UNION
select q1.order_id, q1.total_prod, q2.total_inst from 
(SELECT o.id as order_id, SUM(price * quantity) as total_prod, 0 as total_inst FROM orders as o, orders_product_line as opl, products as p WHERE opl.order_id = o.id AND p.id = opl.product_id GROUP BY o.id) q1 
RIGHT JOIN 
(SELECT o.id as order_id, 0 as total_prod, SUM(rate * hours) as total_inst FROM orders as o,  orders_installation_line as oil, installations as i WHERE oil.order_id = o.id AND i.id = oil.intallation_id GROUP BY order_id) q2
ON q1.order_id = q2.order_id

FINAL EDIT: Here's is a working SQLFiddle of another approach. http://sqlfiddle.com/#!2/deff12/12/0

Community
  • 1
  • 1
Matt Welch
  • 670
  • 4
  • 11
  • this ALSMOT produces the result i am looking for, but fails to show orders that have a 0 in the `total_prod` or `total_inst` columns. For example, with this sql, order_id 6 does not get returned (see edited question for the desired result) – cborgia Dec 10 '13 at 02:58
  • Ahhh then you need an outer join, rather than the default inner join produced by the where clause "where q1.order_id = q2.order_id." I'll update the above answer. – Matt Welch Dec 10 '13 at 03:12
  • Matt, your edited query does produce the desired results, but it does not seem ideal (perhaps not the most optimal route). I will keep having a play with it, if no one else chimes in with a working query that is shorter, I will mark your answer as correct. – cborgia Dec 10 '13 at 03:42
  • There is another way to do it (there's almost always another way), but I don't know off the top of my head if it would perform better or worse. I'll update again. – Matt Welch Dec 10 '13 at 03:57
  • Thanks for your continued efforts matt, I aded a link to a sql dump of the schema and sample data. Your most recent edit produces the same results as your first (missing row if 0 or null) – cborgia Dec 10 '13 at 04:56
  • Ahhh good call. Forgot the JOINS that were the original issue. :) Edited again… hopefully for the last time! – Matt Welch Dec 10 '13 at 05:15
0

I agree with Matt, what you need here is a join/cartesian product and not a union. Try:

SELECT * FROM 
(SELECT o.id as order_id, SUM(price * quantity) as total_prod, FROM orders as o, orders_product_line as opl, products as p WHERE opl.order_id = o.id AND p.id = opl.product_id GROUP BY o.id) as q1 NATURAL JOIN
(SELECT o.id as order_id, SUM(rate * hours) as total_inst FROM orders as o,  orders_installation_line as oil, installations as i WHERE oil.order_id = o.id AND i.id = oil.intallation_id GROUP BY order_id) as q2;

1st EDIT: Okay. So basically, if id is in orders_product_line but not orders_installation_line then we need to insert a tuple with the id, the value of total_prod and a value of 0 for total_inst. Is that correct? So try something like this:

(SELECT * FROM 
(SELECT o.id as order_id, SUM(price * quantity) as total_prod, FROM orders as o, orders_product_line as opl, products as p WHERE opl.order_id = o.id AND p.id = opl.product_id GROUP BY o.id) as q1 NATURAL JOIN
(SELECT o.id as order_id, SUM(rate * hours) as total_inst FROM orders as o,  orders_installation_line as oil, installations as i WHERE oil.order_id = o.id AND i.id = oil.intallation_id GROUP BY order_id) as q2)
UNION 
(SELECT o.id as order_id, SUM(price * quantity) as total_prod, 0 as total_inst FROM orders as o, orders_product_line as opl, products as p WHERE opl.order_id = o.id AND p.id = opl.product_id AND opl.order_id NOT IN (SELECT id FROM orders_installation_line) GROUP BY o.id)
UNION
(SELECT o.id as order_id, 0 as total_prod, SUM(rate * hours) as total_inst FROM orders as o,  orders_installation_line as oil, installations as i WHERE oil.order_id = o.id AND i.id = oil.intallation_id AND oil.order_id NOT IN (SELECT id FROM orders_product_line) GROUP BY order_id);

How does that work? It's probably very suboptimal, but hopefully it gets the result.

James
  • 140
  • 4
  • This too ALMOST produces the result I'm looking for, same problem as Matt's first query (missing rows when 0 value in total_prod/total_inst) – cborgia Dec 10 '13 at 03:56