6

Can anyone tell me which table is considered to be the inner one in a nested loop join? For example if the query is from a inner join b on..., which one, a, or b will be considered inner? I knew that it is b, but from the article at dbsophic, the first example under Small outer loop with a well indexed inner input seems to suggest the reverse.

SexyBeast
  • 7,913
  • 28
  • 108
  • 196
  • 2
    You are confusing two different concepts. "Nested Loops" are an implementation operator, wheras an "inner join" is a language operator. The term "inner" has a different meaning depending on which one you are talking about. – RBarryYoung Aug 29 '12 at 12:22

3 Answers3

8

To be sure...

  • "INNER JOIN" is a the logical (relational) join operator
  • Inner and outer tables are concepts in the physical nested loop join operator

The choice of inner and outer tables for the physical operator is made by the optimiser and is unrelated to the logical operator.

Now, the nested loop psudeo code is this

for each row R1 in the outer table
    for each row R2 in the inner table
        if R1 joins with R2
            return (R1, R2)

So it doesn't make a difference in theory.

In practice, the optimiser will work out the best way around for inner and outer tables: which is what your article link should describe. A.k.a how to reduce the number of iterations

For completeness... INNER JOIN logical operator is commutative and associative
So A INNER JOIN B is the same as B INNER JOIN A.
There is no inner and outer table here

gbn
  • 422,506
  • 82
  • 585
  • 676
  • Yes, this is exactly what I was trying to say in my comment. – RBarryYoung Aug 29 '12 at 12:23
  • Thanks. But in so many places I have found that in order to optimize queries, the join order is efficient. Have a look at the article at http://docs.oracle.com/html/A95912_01/wn32tune.htm under the **Bypassing the Query Optimizer** heading. – SexyBeast Aug 29 '12 at 12:24
  • 2
    @Cupidvogel: "Bypassing the Query Optimizer" means you know better than the experts who write the RDBMS. Unlikely. Also, note that I don't think about the underlying physical operators very much at all day to day. Nor do I 2nd guess the optimiser much – gbn Aug 29 '12 at 12:27
  • 1
    @Cupidvogel: If you are worried about the performance, the following actions will be enough for most cases: (1) Make sure you have the right indexes in place and (2) make sure the database statistics are up to date. – Daniel Hilgarth Aug 29 '12 at 12:29
  • So you are saying that column orders don't matter at all in inner joins? – SexyBeast Aug 29 '12 at 12:30
  • 1
    @Cupidvogel: you mean table order? no. You have to separate logical and physical – gbn Aug 29 '12 at 12:31
  • Yeah yeah I meant table order, that is if `a natural join b natural join c` will always yield the same performance as `b natural join a natural join c` or other permutations... – SexyBeast Aug 29 '12 at 12:33
  • 3
    @Cupidvogel: Table order matters (for performance) at the physical (Implementation) level, but not at the logical (Language) level. The optimizer translates the logical level (Language) into the physical level (implementation) choosing a physical order that it thinks will perform best. The order that you use in the language (logical) doesn't matter, unless you choose to override the optimizer, which is not advised unless you are an expert. – RBarryYoung Aug 29 '12 at 12:35
  • @Cupidvogel: Yes it will. http://en.wikipedia.org/wiki/Join_(SQL)#Implementation. Note that Natural join is dangerous: use `INNER JOIN.. ON` or `INNER JOIN .. USING` See http://stackoverflow.com/a/4826659/27535 and http://en.wikipedia.org/wiki/Join_(SQL)#Natural_join – gbn Aug 29 '12 at 12:36
  • So what is the selection criteria for the inner and outer tables. which table should be large and which should be small? – Ahmed Nawaz Khan Mar 10 '19 at 16:06
2

Actually, both tables are inner as only rows are returned if there is a match in both tables.
When doing an outer join, you specify which table should be the outer one:

  • left outer join: The first table is the one for which all rows are returned and the second table is the one for which only the matching rows are returned.
  • right outer join: The second table is the one for which all rows are returned and the first table is the one for which only the matching rows are returned.
  • full outer join: All rows from both tables are returned.
Daniel Hilgarth
  • 171,043
  • 40
  • 335
  • 443
  • Ummm, there ia table where for each row, matching rows are searched in another table. The first table is the outer one, the 2nd one is the inner, right? – SexyBeast Aug 29 '12 at 12:18
  • I don't understand that comment, sorry. – Daniel Hilgarth Aug 29 '12 at 12:20
  • I meant that in `a inner join b`, isn't it that for each row in `a`, a matching row is searched in `b`, effectively making `a` the outer one in the loop and `b` the inner one? – SexyBeast Aug 29 '12 at 12:30
  • @Cupidvogel: I think gbn already said everything about this topic. Effectifly, you can't know which one will be in the loop and which won't as the optimizer is free to decide. – Daniel Hilgarth Aug 29 '12 at 12:31
0

I wonder if we need to separate the thinking into "outer join" and "inner loop join".

For outer join, there is a convention (tradition?) that names LEFT of LEFT JOIN as "outer table". (see below from Sybase) It seems syntactical designation. (Daniel was explaining this.)

For inner join there is no such distinction, but for nested loop join, there is a need to determine which table becomes the driving table, and it is determined by the optimizer.

Sybase has an article describing inner and outer tables. http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc32300.1570/html/sqlug/sqlug153.htm "The terms outer table and inner table describe the placement of the tables in an outer join"

benik9
  • 151
  • 4