3

I have two tables, a customers and orders table.

The customers table contains a unique ID for each customer. It contains 1141 entries.

The orders table contains many entries with a customerID and a date.

I am trying to query my database and return a list of customers and the max(date) from the orders list.

SELECT *
FROM customers
INNER JOIN
(
    SELECT CustomerID, max(date) as date
    FROM orders
    GROUP BY CustomerID
) Sub1
ON customers.id = Sub1.CustomerID
INNER JOIN orders
ON orders.CustomerID = Sub1.CustomerID
AND orders.date = Sub1.Date

However this query is returning 1726 rows instead of 1141 rows. Where is this getting extra from?

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
Kenta
  • 369
  • 1
  • 10
  • 30

3 Answers3

1

I think it's beacause ORDERS table contains same customerID multiple times, so when you join the table with CUSTOMERS, each CUSTOMER.id matches multiple rows of ORDERS.

ollaw
  • 2,086
  • 1
  • 20
  • 33
0
;with cte as
(
  select CustomerID, orderdate
      , rn = row_number() over (partition by customerID order by orderdate desc)
from orders
)

select c.*, cte.orderdate
from customer c
 join cte on cte.customerID = c.customerid 
where rn =1 -- This will limit to latest orderdate
KeithL
  • 5,348
  • 3
  • 19
  • 25
0

The problem is that there are ties.

For a given customer, some place more than one order per day. So there's a possibility that occasionally some may have placed more than one order on the date that is their max date.

To fix this, you need to use MAX() or some column that is always unique in the Orders table (or at least unique within a given date). This is easy if you can depend on an auto-increment primary key in the Orders table:

SELECT *
FROM customers
INNER JOIN
(
    SELECT CustomerID, max(orderid) as orderid as date
    FROM orders
    GROUP BY CustomerID
) Sub1
ON customers.id = Sub1.CustomerID
INNER JOIN orders
ON orders.CustomerID = Sub1.CustomerID
AND orders.orderid = Sub1.orderid

This assumes that orderid increases in lock-step with increasing dates. That is, you'll never have an order with a greater auto-inc id but an earlier date. That might happen if you allow data to be entered out of chronological order, e.g. back-dating orders.

Strawberry
  • 33,750
  • 13
  • 40
  • 57
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • So essentially, this selects the max date from the max orderid from each customer? – Kenta Feb 02 '17 at 14:48
  • It doesn't select dates, it selects orderids. If orderids are assigned in chronological order, then it naturally corresponds with the max date for each customer. – Bill Karwin Feb 02 '17 at 16:15