0

I have the following tables:

Customers

ID     Name 
============
1      John 
2      Alice
3      Bob

Orders

ID     CustomerID   Status
==========================
1001       1          1
1002       2          1
1003       2          2
1004       3          2

I'd like to join tables showing one entry per customer only (the one with lowest Status) i.e.

ID     Name    OrderID 
======================
1      John     1001
2      Alice    1002
3      Bob      1004

Thanks to the answer to this question, I chose 2 solutions which produce the same output:

Solution 1

SELECT c.id, c.name, o.id FROM customers AS c
INNER JOIN orders AS o ON
c.id =  o.customerid
WHERE o.status = (SELECT MIN(status) FROM orders WHERE customerid = c.id)

Solution 2

SELECT c.id, c.name, o.id FROM customers as c
INNER JOIN orders AS o ON
o.id = (SELECT TOP 1 id FROM orders WHERE customerid = c.id ORDER BY status)

Trying to understand which one runs faster, I used SQL Fiddle View Execution Plan which gave the following:

Solution 1

Solution 1

Solution 2

Solution 2

How to interpret those diagrams and which one performs faster? Using MS SQL Server 2016.

kuma
  • 121
  • 6
  • The execution plan doesn't tell you how fast it runs. You want [set-statistics-time](https://learn.microsoft.com/en-us/sql/t-sql/statements/set-statistics-time-transact-sql?view=sql-server-ver15) for that. – Dale K Nov 27 '20 at 00:35
  • Dale is correct (and/or use recent versions of SSMS, which include times on the Actual execution plans). However, I'm pretty certain the first one will be faster - the first solution has 2 table scans, while the second solution has 3 table scans, two of which are within a nested loop join (e.g., it may be run many many times). – seanb Nov 27 '20 at 01:48

1 Answers1

0

Here's my breakdown and the last one is my suggestion to you.

Query Cost 67%

SELECT c.id, c.name, o.id FROM @Customers AS c
INNER JOIN @Orders AS o ON
c.id =  o.customerid
WHERE o.status = (SELECT MIN(status) FROM @Orders WHERE customerid = c.id)

Query Cost 66%

SELECT c.id, c.name, o.id FROM @Customers as c
INNER JOIN @Orders AS o ON
o.id = (SELECT TOP 1 id FROM @Orders WHERE customerid = c.id ORDER BY status)

Query Cost 47%

SELECT
  x.CustID,
  x.Name,
  x.OrderID
FROM (SELECT
  C.id CustID,
  c.Name,
  o.ID OrderID,
  o.status,
  ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY o.status) rn
FROM @Customers c
INNER JOIN @Orders o
  ON o.CustomerID    = c.ID) x
WHERE x.rn = 1
Dharman
  • 30,962
  • 25
  • 85
  • 135
RoMEoMusTDiE
  • 4,739
  • 1
  • 17
  • 26
  • Thanks. But what query cost mean? Percentage of what? – kuma Nov 28 '20 at 00:04
  • @kuma Query cost is what optimizer thinks of how long your query will take (relative to total batch time). The optimizer tries to choose the optimal query plan by looking at your query and statistics of your data, trying several execution plans and selecting the least costly of them – RoMEoMusTDiE Nov 29 '20 at 21:36
  • sorry it's still unclear what 100% represents in these 3 cases – kuma Dec 01 '20 at 17:46
  • @kuma I take it you don't understand english very well. to simplify, Ideal query performance means less query cost. – RoMEoMusTDiE Dec 02 '20 at 00:11
  • this has nothing to do with english. As the percentage is relative to the batch, _what is the batch_ in your examples? If the batch comprises the 3 queries, their query costs don't add up to 100% as [it should be](https://stackoverflow.com/questions/3191356/query-cost-relative-to-batch-is-100) – kuma Dec 03 '20 at 19:53
  • @kuma that's what you did not get... Query Cost is simply the amount of effort SQL performed your batch request.. less cost meaning faster query. this is not math where need to add something up to 100%. – RoMEoMusTDiE Dec 03 '20 at 20:09