1

Inside an SP I need to find out the Id's of some clients of the first account whose Code matches any of the second account's clients. I wrote the following query that works -

    SELECT DISTINCT cil.Id FROM ClientIdList AS cil
    INNER JOIN Client AS c1
    ON cil.Id = c1.Id
    INNER JOIN Client AS c2
    ON c1.Code = c2.Code
    WHERE c2.AccountId = 2
    ORDER BY cil.Id  

Here ClientIdList is a single-column table-type variable which holds the Ids of the selected clients from the first account (and I need to use this variable for other requirements prior to this point). I can get the same correct output if I put the condition in the WHERE clause as the JOIN condition as follows -

    SELECT DISTINCT cil.Id FROM ClientIdList AS cil
    INNER JOIN Client AS c1
    ON cil.Id = c1.Id
    INNER JOIN Client AS c2
    ON c1.Code = c2.Code AND c2.AccountId = 2
    ORDER BY cil.Id  

Considering the 2000 accounts and 10000 clients per account (that is, 2000 x 10000 rows in Client table) which one would be an appropriate choice?

Can the query be optimized further to improve performance?

Edit : Actually the condition is c2.AccountId = @accountId where the @accountId is a parameter to the SP

Edit 2 : As much as I understand, with the WHERE clause version the JOIN will be performed with the rest of the Client table, and then the result will be filtered based on the WHERE condition. But with the later version the JOIN should be performed with a smaller set of rows for which the condition satisfies. Am i right? If so, shouldn't the later version give better performance?

atiyar
  • 7,762
  • 6
  • 34
  • 75
  • 2
    How about checking the execution plans? – toniedzwiedz Mar 02 '13 at 23:51
  • As @Tom says check the execution plans. They may be identical though. – Rob Mar 02 '13 at 23:53
  • @Tom: I checked the execution plan and the client statistics. But (may be because of the lack of my knowledge/experience about them) outputs seem same to me for both cases :( – atiyar Mar 02 '13 at 23:56
  • @Rob: they seems identical (almost) to me – atiyar Mar 02 '13 at 23:59
  • 1
    possible duplicate of [INNER JOIN ON vs WHERE clause](http://stackoverflow.com/questions/1018822/inner-join-on-vs-where-clause) – Andriy M Mar 03 '13 at 00:53
  • 1
    @AndriyM: the question is *NOT* about JOIN vs WHERE. its about the filtering conditions with WHERE clause which could otherwise be put as JOIN condition. *please READ a question-body before voting it for closing* – atiyar Mar 03 '13 at 01:27
  • You are right, that was the wrong choice, especially since that question seems to be specifically about MySQL. Sorry. I'll try to amend my mistake by offering an arguably more relevant link: [Condition within JOIN or WHERE](http://stackoverflow.com/questions/1018952/condition-within-join-or-where). – Andriy M Mar 03 '13 at 10:32

3 Answers3

1

Whatever, the db engine ends up with the same execution plan and there is no way to improve the writing of this query (not talking about indexes here).

As to best practices, I'll put the c2.AccountId = 2 into the WHERE because of the constant (this is something someone might edit to change the selection of the query, whereas the join is more behavior-related and should not be subject to this kind of modification).

But I usually see both of the writings and to put it in the JOIN doesn't shock me that much :-)

jazzytomato
  • 6,994
  • 2
  • 31
  • 44
  • Putting "c2.AccountId = 2" into the WHERE clause ia a bad habit because the semantics change when an outer join is involved. – Pieter Geerkens Mar 03 '13 at 00:14
  • Please see the Edit section – atiyar Mar 03 '13 at 00:18
  • @PieterGeerkens it wouldn't mean the same thing for an outer join I agree, but here it is an inner join and I stick to the idea of joining the two tables on the foreign keys and filtering on the parameter. But it is a matter of preference when it comes to inner join I suppose – jazzytomato Mar 03 '13 at 00:26
  • @ThomasHaratyk: can you please see the second Edit and add to your answer? – atiyar Mar 03 '13 at 01:46
  • As I said, the db engine will use the same execution plan for both version. This is why performance cannot be improved here (in the writing of the query). The sentence you wrote in your second edit is the way you read the query and understand it, it's not the way it is interpreted by the db engine – jazzytomato Mar 03 '13 at 10:00
1

Use the new JOIN syntax for several reasons:

  • The code is more readable.
  • It better describes the intent of the query, allowing more efficient /better optimization.
  • You will not run into problems with OUTER JOINS where the semantics differs for these clauses.
Pieter Geerkens
  • 11,775
  • 2
  • 32
  • 52
  • can you please see the second Edit and add to your answer? – atiyar Mar 03 '13 at 01:46
  • What if we add a third table that will also join with c2? Will you repeat the condition on join just in case it could be transformed into an outer join? The condition in WHERE clause scales better with many joins. – Jakub Kania Mar 03 '13 at 10:48
0

Both queries give the same performance. You can oprimize your query remove second JOIN

SELECT DISTINCT cil.Id 
FROM ClientIdList AS cil JOIN Client AS c ON cil.Id = c.Id                          
WHERE c.AccountId = 2
ORDER BY cil.Id 
Aleksandr Fedorenko
  • 16,594
  • 6
  • 37
  • 44