Which is efficient? query with subquery or joined table

Question

Here are two example of query for the same purpose (in this example, I want to fine salesman from same city where Patriks lives.)

select * 
from salesman 
where salesman.city = 
     ( select city 
       from salesman 
       where salesman_name='Patriks'
     );

and

select s1.* 
from salesman s1,salesman s2 
where s1.city=s2.city 
  and s2.salesman_name='Patriks';

Which is better or efficient and why? (I know this is small example, but I want to figure out, which will be good in complex situation, and for big database.)

http://stackoverflow.com/questions/2577174/join-vs-subquery – amd Feb 06 '12 at 08:07 — amd, Feb 06 '12 at 08:07

score 5 · Answer 1 · edited Feb 06 '12 at 08:38

My experience is that in Oracle, a flattened query (that is, the one with the join) is often more efficient than an equivalent query using a subselect. It seems that in the more complex cases, there are query paths that the Oracle optimiser doesn't find, for a query with a subselect.

In SQL Server, DB2, Ingres and Sybase, my experience is that it makes no difference - these DBMSs have optimisers that will find the same query paths, regardless of whether you use a flattened query or a query with a subselect.

I don't have sufficient experience of other DBMSs to comment on these.

But that's just my experience. I wouldn't be too surprised if you find different results for particular queries, or particular sets of data. The best thing to do is to try out both and see which query works better, for your situation.

score 5 · Accepted Answer · answered Feb 06 '12 at 08:26

5

As a general rule of thumb:

If you use a subquery, you force Oracle to use a certain execution path (ie it must execute the sub-query before it can execute the outer query)

If you use a join, Oracle is free to pick whichever it considers to be the most efficient path.

I would always go for the join over the subquery therefore. YMMV.

answered Feb 06 '12 at 08:26

cagcowboy

30,012
11
69
93

I don't understand: **why** does a subquery force Oracle to use a certain execution path? If you don't know why, what makes you think this is generally true? – onedaywhen Feb 06 '12 at 08:37
2

Because you can't work on the outer bit until you've got the result of the inner bit. (As an example think of it like x = (5*(4+2)). You have to do the inner (4+2) before you can do the outer multiplication.) Note that there's a chance the optimiser might be able to work around the inner/outer structures, but in my experience it's frequently not the case. – cagcowboy Feb 06 '12 at 08:43
Keeping with your analogy, the optimizer should be able to transform it by removing the 'inner' parens i.e. x = (5*4+5*2). – onedaywhen Feb 06 '12 at 10:03
2

The fact that it should doesn't always mean that it does. – cagcowboy Feb 06 '12 at 10:30
OK, I'm going to downvote this because your "general rule of thumb" claim is unsubstantiated. I've upvoted David Wallace's answer because they have made their position clearer. – onedaywhen Feb 06 '12 at 11:18
2

I have upvoted cagcowboy's answer, because it's better than mine. He/she has at least given a heuristic to suggest which query plans Oracle will and won't find; which I was unable to do (other than to say that Oracle won't find all query plans). As far as "why" Oracle works that way; well who, other than the designers of Oracle, can really say? – Dawood ibn Kareem Feb 07 '12 at 08:19

Branko Dimitrijevic · Answer 3 · 2012-02-08T12:09:24.913

In my experience, as long as subquery and JOIN mean the same thing, Oracle will execute them equally fast.

People often rewrite their queries and think they have stayed equivalent, when in fact they have introduced subtle differences. For example, whether the OP's queries mean the same thing depends on whether salesman_name is PRIMARY KEY (or UNIQUE). If it isn't, these queries no longer mean the same thing.

That being said, I wouldn't be too surprised if there really were some cases (as others have indicated), where Oracle actually does produce substantially different execution plans. Your mileage may vary, but as always - measure on representative amounts of data and don't just assume blindly one way or the other.

score 2 · Answer 4 · answered Feb 08 '12 at 01:29

There is simply no answer to this question. Even if your table structure doesn't change, queries can get different execution paths over time depending on the amount of data, the indexes, the constraints, because of bind variable peeking, and scads of other factors. Entire books have been written on the subject.

cagcowboy's answer is incorrect. Oracle will rewrite your query to provide what it thinks is the best execution plan. Queries like the one you describe are often transformed by subquery unnesting. My guess is that 9 times out of 10, queries similar to the ones you describe will have the same execution plan.

In my opinion, start with what is most readable, what will make it clearest to someone else reading your code (or yourself, reading it six months from now), what your intent is. If your query runs unacceptably slow, only then try to optimize it.

As Branko Dimitrijevic points out, two queries you think are the same often aren't equivalent. In your two examples, if salesman_name is not unique, your first query will throw a ORA-01427: single-row subquery returns more than one row exception, but your second example will work fine.

> no issue, its just example, I can use there `in` in place of `=` if it is supposed to return more than one row. — Patriks, Feb 08 '12 at 04:49
in some case _sub-query_ is better than joining to tables, once i faced > when, there were thousands of rows in tables, and i used query ( as of second one in my question) it was taking about a minute to retrieve data, then i replaced it with _sub-query_ type query it was good. — Patriks, Feb 08 '12 at 04:56
i think it happen because `select s1.* from salesman s1,salesman s2 where s1.city=s2.city and s2.salesman_name='Patriks';` in this example, sql first needs to compare all rows of s1 with rows of s2 to find matching rows to join with. so it needs to match x * x times (x=no. of rows) and in the case of `select * from salesman where salesman.city = ( select city from salesman where salesman_name='Patriks' );` it just needs to compare out side's rows with only one row returned by inner query. So in that case i found sub_query better — Patriks, Feb 08 '12 at 04:57

Which is efficient? query with subquery or joined table

4 Answers4