-2

I have exhausted all of my options. I cannot understand where I am going wrong with this. Please help me. I can not take much more headbanging. Here is where I am at...

Below is an explanation of my table structure.

locations_management table
contains data linking the locations_management_id, season_id, location_id and other unneeded information together

orders table
contains data about the order including the location_management_id and the order_id as well as other unneeded information

orders_products table
product data linked to the orders by order_id. This table only uses these columns: order_product_id, order_id, product_id, piece_qty

orders_adjustments table
used to track any adjustments to the inv_shipped. This table uses the order_id, product_id, piece_qty columns

Here is where I am at today. The query below pulls data from the tables above.

Basically I am asking for the location_management_id(s) from the locations_management table WHERE season_id = 12 AND location_id = 35. There can be more than one possible location_management_id that fits both the season_id and location_id. I then need to find the orders that match these location_management_id(s). Once the orders are found, I need to use the order_id(s) to find the products associated to them in the orders_products table. This query does exactly that but when I take it a step further to combine/SUM the piece_qty for a total inv_shipped, crazy things happen to the numbers.

SELECT  
    locations_management.season_id,  
    locations_management.location_id,  
    orders.order_id,  
    orders_products.product_id,  
    IFNULL((orders_products.piece_qty), 0) AS inv_shipped,  
    IFNULL((orders_adjustments.piece_qty), 0) AS inv_adjustments  
FROM  
    locations_management  
    JOIN orders USING (location_management_id)  
    LEFT JOIN orders_products USING (order_id)  
    LEFT JOIN orders_adjustments ON (orders_adjustments.order_id = orders_products.order_id) AND (orders_adjustments.product_id = orders_products.product_id)  
WHERE  
    locations_management.season_id = 12 AND locations_management.location_id = 35  
GROUP BY   
    product_id, orders_products.order_id

When I run the query above, this is what I get...

season_id   location_id     order_id    product_id      inv_shipped     inv_adjustments
12          35              2127        1               220             0
12          35              2194        1               160             0
12          35              2127        3               312             0
12          35              2127        4               24              0
12          35              2127        5               180             0
12          35              2194        5               24              0
12          35              2127        7               144             0
12          35              2127        7               24              0

This is exactly what I would expect to get. Multiple order_id's grouped by the product_id and all the data is accurate. So now here becomes the problem. I want to add/SUM the product_id's together when they match and have a combined inv_shipped. So product_id 1 would now total 380 for inv_shipped.

When I take the same query from above and I add SUM to the inv_shipped and inv_adjustments (as seen below), I get this data output below. Notice how some of the values have doubled but also the matching product_id rows are not combined.

    IFNULL(SUM(orders_products.piece_qty), 0) AS inv_shipped,  
    IFNULL(SUM(orders_adjustments.piece_qty), 0) AS inv_adjustments  

season_id   location_id     order_id    product_id      inv_shipped     inv_adjustments
12          35              2127        1               440             0
12          35              2194        1               160             0
12          35              2127        3               624             0
12          35              2127        4               48              0
12          35              2127        5               360             0
12          35              2194        5               24              0
12          35              2127        7               288             0
12          35              2127        7               24              0

If I change the GROUP BY to product_id only, I get the follow data:

    GROUP BY product_id  

season_id   location_id     order_id    product_id      inv_shipped     inv_adjustments
12          35              2127        1               600             0
12          35              2127        3               624             0
12          35              2127        4               48              0
12          35              2127        5               384             0
12          35              2127        7               312             0

Again these inv_shipped totals are not correct. So where am I going wrong?

------------------------------------ Suggestions ------------------------------------

This query below was suggested but the data output for the inv_shipped is not added correctly either.

SELECT 
    locations_management.season_id,
    locations_management.location_id,
    orders.order_id,
    products.product_id,
    products.inv_shipped 
FROM
    locations_management
    JOIN (SELECT location_management_id, order_id FROM orders group by order_id) AS orders ON orders.location_management_id = locations_management.location_management_id
    JOIN (SELECT order_id, product_id, IFNULL(SUM(piece_qty), 0) AS inv_shipped FROM orders_products GROUP BY order_id, product_id) AS products ON products.order_id = orders.order_id

WHERE
    locations_management.season_id = 12 AND locations_management.location_id = 35
ORDER BY 
    product_id, order_id  
    
season_id   location_id     order_id    product_id      inv_shipped     inv_adjustments
12          35              2127        1               440             0
12          35              2194        1               160             0
12          35              2127        3               624             0
12          35              2127        4               48              0
12          35              2127        5               360             0
12          35              2194        5               24              0
12          35              2127        7               288             0
12          35              2127        7               24              0
DavidK
  • 5
  • 3
  • 2
    Aggregate (calculate SUM per product_id) in `products` subquery, not in outer query. – Akina Dec 24 '20 at 05:04
  • Hello and thank you, I have tried this change. I get the same results. See my comments below to tcadidot0's post. I go into more details about the issue and what I can see when I troubleshoot it. – DavidK Dec 24 '20 at 13:11
  • Please in code questions give a [mre]--cut & paste & runnable code, including smallest representative example input as code; desired & actual output (including verbatim error messages); tags & versions; clear specification & explanation. For SQL that includes DBMS & DDL (including constraints & indexes) & input as code formatted as a table. [ask] Chop to the 1st subexpression with unexpected result & explain what you expect & why, with justification referencing documentation. – philipxy Dec 25 '20 at 04:14
  • It's very good that you (almost) give working code before extending to problem code, but first use select \* after your 2 joins to see what you're aggregating on. The 1st comment here is correct. This has a common error where people want some joins, each possibly involving a different key, of some subqueries, each possibly involving join and/or aggregation, but they erroneously try to do all the joining then all the aggregating or to aggregate over previous aggregations. Write separate sums over appropriate rows and/or sum a case statement picking rows; join on common unique column sets. – philipxy Dec 25 '20 at 04:18
  • When giving a relation(ship)/association or table (base or query result), say what a row in it states about the business situation in terms of its column values. Constraints aren't needed to query, table meanings (per the preceding sentence) are sufficient & necessary. But if a query relies on some constraints, give them. When you get an unexpected/"wrong" result, pause work on your overall goal & debug your misconception(s). PS A "basically" or "essentially" or "in other words" that doesn't introduce or summarize a clear, precise & full description that is also given just means "unclearly". – philipxy Dec 25 '20 at 04:32
  • If you want a select with aggregation of the query that works then why not make the query that works a subquery & select & aggregate it? (You don't clearly say or give an example of what you want & you don't give your wrong code trying to get it.) (Presumably in your wrong code that you don't give you add aggregation to the above select list & you think you're aggregating over the query that works but that would aggregate over all the rows from the joins, ie the rows you get from replacing the current correct query's selected columns by a \*.) (Guesses while awaiting a [mre].) – philipxy Dec 25 '20 at 04:51

2 Answers2

1

First, find the root of the issue. What changes in between the correct and incorrect information?

Let's take a look at your second query. From what I can see, there are three things that changes:

  1. You've joined with another sub-query.
  2. You've added a SUM operation.
  3. You've added a GROUP BY.

Nest step is to try with removing SUM and GROUP BY, like this:

SELECT 
    locations_management.season_id AS season_id,
    locations_management.location_id AS location_id,
    orders.order_id AS order_id,
    products.product_id AS product_id,
    products.piece_qty
FROM
    locations_management
    JOIN (SELECT location_management_id, order_id FROM orders group by order_id) AS orders ON orders.location_management_id = locations_management.location_management_id
    JOIN (SELECT order_id, product_id, piece_qty FROM orders_products) AS products ON products.order_id = orders.order_id

WHERE
    locations_management.season_id = 12 AND locations_management.location_id = 35

I assume that each product_id will return two (or more) rows. That's probably because of your second JOIN have two (or more) rows for order_id in orders_products table; it seems obvious because the first sub-query for table orders have group by order_id. So, now to quickly fix this, you need to do the SUM inside the second sub-query instead. Something like this:

SELECT 
    locations_management.season_id AS season_id,
    locations_management.location_id AS location_id,
    orders.order_id AS order_id,
    products.product_id AS product_id,
    products.inv_shipped 
FROM
    locations_management
    JOIN (SELECT location_management_id, order_id FROM orders group by order_id) AS orders ON orders.location_management_id = locations_management.location_management_id
    JOIN (SELECT order_id, product_id, IFNULL(SUM(products.piece_qty), 0) AS inv_shipped FROM orders_products GROUP BY order_id, product_id) AS products ON products.order_id = orders.order_id

WHERE
    locations_management.season_id = 12 AND locations_management.location_id = 35;

This might return you the correct result however I personally will write the query like this:

SELECT lm.season_id, lm.location_id, o.order_id , p.product_id, p.inv_shipped 
FROM locations_management AS lm
JOIN (SELECT location_management_id, order_id 
      FROM orders 
      GROUP BY location_management_id,order_id) AS o 
  ON o.location_management_id = lm.location_management_id
JOIN (SELECT order_id, product_id, IFNULL(SUM(products.piece_qty), 0) AS inv_shipped 
      FROM orders_products 
      GROUP BY order_id, product_id) AS p 
  ON p.order_id = o.order_id
WHERE
    lm.season_id = 12 AND lm.location_id = 35;
  1. You don't need to set alias if your alias is the same as column name; for example lm.season_id AS season_id. If you remove .. AS season_id, the column will be recognized as season_id nonetheless. You won't see it as lm.season_id.. well at least for most tools that I know of. Also, I personally think aliases are meant to shorten long table or column names but "to each their own".
  2. GROUP BY should include all the non-aggregated column(s) in SELECT. Of course, if the sql_mode=only_full_group_by is turned off, you can run the query but the correct setting should be ON. You can read more of the reason why here.
  3. With the additional columns added in GROUP BY, this query might not return the result you once had. That's depending on you data and if that happen, I suggest you edit your question and add a Minimal, reproducible example. At the moment, we're only seeing queries and no example table/data to work with. Its better if you can create a fiddle with a few rows of data.
FanoFN
  • 6,815
  • 2
  • 13
  • 33
  • I greatly appreciate it you taking time to help me! "What changes in between the correct and incorrect information?" So when I use the modified query with both JOINs, the data that is output is exactly how I want it. When two ore more orders exists, those orders are grouped by their matching product_id and SUM'd together. The first output from above shows how I am joining the locations table to the order table to get order_ids based on the season_id AND location_id. Then I added the second join to pull all products from the orders_products table that have matching order_ids. – DavidK Dec 24 '20 at 12:49
  • When I query season_id AND location_id and that data has orders with product data, the inv_shipped is SUM'd correctly. From what I can troubleshoot, this incorrect SUM'ing only happens when there is an order (order_id) that does not have an order (orders_products) actually assigned yet. The order exists in the orders table but no products have been assigned yet. On the other hand, when I query a season_id AND location_id with products for each order_id, the combined SUM for each product_id is correct. Both of your modified queries returns the same results I get. – DavidK Dec 24 '20 at 12:50
  • So what you're saying, inside of table `order_products`, there might be repeated `order_id`, without `product_id` assigned to it but the value of `piece_qty` present? If so, you maybe can change the sum syntax to `IFNULL(SUM(CASE WHEN product_id IS NULL THEN 0 ELSE piece_qty END), 0) `.. (change `IS NULL` depending on the default unassigned value of `product_id`). See if that would work – FanoFN Dec 25 '20 at 06:40
0

I don't see the need for those nested subqueries. And if you want one row per order and product, then aggregation may not be necessary.

You seem to want Doesn't this do what you want?

select lm.season_id, lm.location_id, 
    op.order_id, op.product_id, op.piece_qty as inv_shipped
from locations_management lm
inner join orders o on o.location_management_id = lm.location_management_id 
inner join order_products op on op.order_id = o.order_id
where lm.season_id = 12 and lm.location_id = 35

Or if you want one row per product:

select lm.season_id, lm.location_id, 
    op.product_id, coalesce(sum(op.piece_qty), 0) as inv_shipped
from locations_management lm
inner join orders o on o.location_management_id = lm.location_management_id 
left join order_products op on op.order_id = o.order_id
where lm.season_id = 12 and lm.location_id = 35
group by lm.season_id, lm.location_id, op.product_id
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Thank you for taking time to help me I agree there should not be a need for them but as I have googled for the last few days on why SUM can incorrectly output totals, doing straight JOINs are the most common problem. The above query was just my latest of many iterations. Please see my comments above – DavidK Dec 24 '20 at 13:00
  • Your first query idea outputs all rows of products assigned to each order_id. It does not combine/SUM them together based on their product_id. The second query does exactly what my original example above does. When all orders have products, inv_shipped is SUM'c correctly. If one order exists without products assigned yet, inv_shipped is added to itself based on how many orders exists. – DavidK Dec 24 '20 at 13:05
  • @DavidK: I don't know what you mean. The second query filters out orders that have no product. You can retain them if you like, using a `left join`. I changed that in the code. – GMB Dec 24 '20 at 16:02
  • Hello GMB. I completely edited my original post above to give more details about the tables and how the data is being used. Please have a look and maybe this will make things more understandable. – DavidK Dec 25 '20 at 03:12
  • 1
    @DavidK: please show the results that you want as well. – GMB Dec 25 '20 at 10:16