3

I'm kind of stumped on this one. What I'd like to do is get all Orders that have the exact identical items and quantities to a given Order.

For example, Order number 100 has 1 Blue Widget, 2 Pink Widgets, and 1 Green Widget. I want to select all Orders from the database that also have exactly 1 Blue Widget, 2 Pink Widgets, and 1 Green Widget, and no items more or less than that.

The goal is to answer the question, "the last time this exact combination of items was shipped, what box did we ship it in?". If I can get orders that are an exact match in terms of items, I can see what box was used before.

My Table structure is:

Orders_Items

+---------------------------+-----------+-------------+
| OrderID   | OrderPartID   |   ItemID  | QtyOrdered  |
+---------------------------+-----------+-------------+
| 1         |   21          |      4    | 11.5000     |
| 2         |    8          |      5    |  4.3333     |
+---------------------------+-----------+-------------+


Orders_OrderParts

+--------------+------------+-----------+
| OrderPartID  | OrderID    |  StatusID |
+--------------+------------+-----------+
| 1            |   21       |    4      |
| 2            |    8       |    5      |
+---------------------------+-----------+


Orders_OrderParts_2_Shipments

+--------------+------------+--------------+
| OrderPartID  | OrderID    |  ShipmentID  |
+--------------+------------+--------------+
| 1            |   21       |      4       |
| 2            |    8       |      5       |
+---------------------------+--------------+

The "ShipmentID" is the magic number we're trying to find since it will tell me which boxes were used in the past.

I've started with this as a basis.

SELECT *, COUNT(*) AS ItemsInOrder
FROM Orders_Items OI
LEFT JOIN Orders_OrderParts OP
   ON OP.OrderPartID = OI.OrderPartID
LEFT JOIN Orders_OrderParts_2_Shipments OP2S
   ON OP2S.OrderPartID = OP.OrderPartID
WHERE OP.StatusID=800 # Only select Orders that have shipped.
GROUP BY OI.OrderPartID

This is a working query, but for now, it only returns all items grouped by order part. I need to remove all the order parts that don't contain the items in the current order part.

The Query:

SELECT * FROM `Orders_Items` WHERE OrderPartID = 100

returns all of the items in the current order part and their quantities, ie:

+--------------+------------+--------------+
| OrderPartID  |   ItemID   |  Quantity    |
+--------------+------------+--------------+
| 100          |     21     |      1       |
| 100          |     8      |      3       |
+---------------------------+--------------+

So I need to narrow the first query by showing only order parts that have exactly 1 of Item #21, and exactly 3 of Item #8.

And that's where I'm stuck. I could do the two queries separately and use PHP to loop over the results and compare them, but I'm curious if it would be more efficient to use a pure MySQL solution... or if a single query is even possible in this situation?

Nick
  • 10,904
  • 10
  • 49
  • 78
  • See this similar question: http://stackoverflow.com/questions/7364969/how-to-filter-sql-results-in-a-has-many-through-relation –  Nov 23 '11 at 10:14
  • I'm not sure that other question is similar enough. The OP is asking to find overlapping students, given two clubs. I'm trying to find other "clubs" with an identical *group* of "students" and quantities of that "student" based on the members of one known "club". – Nick Nov 23 '11 at 16:48
  • Are you allowed to 'cheat' and create some sort of "fingerprint" column for the orders as they are created? It's not great SQL, but might solve the problem for now. – Cylindric Nov 24 '11 at 18:00
  • This is a relational division problem. See the [Exact Division](http://www.simple-talk.com/sql/t-sql-programming/divided-we-stand-the-sql-of-relational-division/) section here. Though as you are on MySQL I might also experiment with using `GROUP_CONCAT` to get a delimited list denoting an order and joining on that. – Martin Smith Nov 25 '11 at 11:10
  • Where did that `100` appear from? You have no `100` in your tables. You better show us a sample of the data and a **result-set from that same data**. – ypercubeᵀᴹ Nov 25 '11 at 13:39

1 Answers1

3

I think I have it. I'll explain the process and test setup I came up with first...


Setup

I created three tables, one for orders, one for items, and one to join the two:

CREATE TABLE Items (
    ItemId int NOT NULL,
    ItemName nvarchar(50) NULL
);

CREATE TABLE Orders (
    OrderId int NOT NULL,
    OrderName nvarchar(50) NULL,
    BoxType nvarchar(50) NULL
);

CREATE TABLE OrdersItems(
    OrderId int NOT NULL,
    ItemId int NOT NULL,
    Quantity int  NOT NULL
);

I inserted some test orders and items:

INSERT INTO Items (ItemId, ItemName) VALUES (1, 'Apples'),(2, 'Oranges'),(3, 'Pears');
INSERT INTO Orders (OrderId, OrderName, BoxType) VALUES
    (1, 'Order1', 'small'),
    (2, 'Order2', 'medium'),
    (3, 'Order3', 'small'),
    (4, 'Order4', 'large'),
    (5, 'Order5', 'small');
INSERT INTO OrdersItems (OrderId, ItemId, Quantity) VALUES (1, 1, 10),(1, 2, 5);
INSERT INTO OrdersItems (OrderId, ItemId, Quantity) VALUES (2, 1, 5),(2, 2, 7),(2, 3, 2);
INSERT INTO OrdersItems (OrderId, ItemId, Quantity) VALUES (3, 1, 10),(3, 2, 5);
INSERT INTO OrdersItems (OrderId, ItemId, Quantity) VALUES (4, 1, 10),(4, 2, 5),(4, 3, 2);
INSERT INTO OrdersItems (OrderId, ItemId, Quantity) VALUES (5, 1, 4),(5, 3, 1);

This gives this set of orders:

All orders


Order Summary

So what this all does is create 5 orders:

  1. 10 apples and 5 oranges (The target order)
  2. 5 apples, 7 oranges and 2 pears (has apples and oranges, but wrong amount and also pears)
  3. 10 apples and 5 oranges (The matching order)
  4. 10 apples, 5 oranges and 2 pears (correct apples and oranges, but also pears)
  5. 4 apples and 1 pear (no oranges)

Getting the data

First we get a list of all orders that have at least the same items as our target order. So in our case it must have 10 apples and 5 oranges, but can also have extra order items:

SELECT DISTINCT O2.OrderName
FROM Orders O1
LEFT JOIN OrdersItems OI1 ON (O1.OrderId=OI1.OrderId)
LEFT JOIN Items I1 ON (OI1.ItemId=I1.ItemId)
LEFT JOIN OrdersItems OI2 ON (OI1.ItemId=OI2.ItemId AND OI1.OrderId<>OI2.OrderId AND OI1.Quantity=OI2.Quantity)
LEFT JOIN Orders O2 ON (OI2.OrderId=O2.OrderId)
WHERE O1.OrderId=1;

Same or with extra

Next we get a list of orders that have items that our target order does not have:

SELECT DISTINCT O1.OrderName
FROM Orders O1
LEFT JOIN OrdersItems OI1 ON (O1.OrderId=OI1.OrderId)
CROSS JOIN Orders O2
LEFT JOIN OrdersItems OI2 ON (O2.OrderId=OI2.OrderId AND OI1.ItemId=OI2.ItemId and OI1.Quantity=OI2.Quantity)
WHERE O1.OrderId<>1
AND O2.OrderId=1
AND OI2.ItemId IS NULL;

extras

Now combine the two to get only orders that are the same!

SELECT DISTINCT O2.OrderName, O2.BoxType
FROM Orders O1
LEFT JOIN OrdersItems OI1 ON (O1.OrderId=OI1.OrderId)
LEFT JOIN Items I1 ON (OI1.ItemId=I1.ItemId)
LEFT JOIN OrdersItems OI2 ON (OI1.ItemId=OI2.ItemId AND OI1.OrderId<>OI2.OrderId AND OI1.Quantity=OI2.Quantity)
LEFT JOIN Orders O2 ON (OI2.OrderId=O2.OrderId)
WHERE O1.OrderId=1
AND O2.OrderId NOT IN
(
    SELECT DISTINCT O1.OrderId
    FROM Orders O1
    LEFT JOIN OrdersItems OI1 ON (O1.OrderId=OI1.OrderId)
    CROSS JOIN Orders O2
    LEFT JOIN OrdersItems OI2 ON (O2.OrderId=OI2.OrderId AND OI1.ItemId=OI2.ItemId and OI1.Quantity=OI2.Quantity)
    WHERE O1.OrderId<>1
    AND O2.OrderId=1
    AND OI2.ItemId IS NULL
);

The order

Now, you might have problems if there are multiple orders the same, but used different boxes, in which case you'll have to either show them all, or pick the one with be biggest COUNT(OrderId) or whatever.


The full script I used to test this (works both with MSSQL and MySQL) can be found here, and it has a couple of extra orders for testing too.

Cylindric
  • 5,858
  • 5
  • 46
  • 68