1

i am doubting about this case, but not clear why.

consider the following sql :

create table t1(tid int not null, t1 int not null);
create table t2(t2 int not null, tname varchar(30) null);
create unique index i_t2 on t2(t2);
create or replace view v_1 as
select t1.tid,t1.t1,max(t2.tname) as tname
from t1 left join t2
on t1.t1 = t2.t2
group by t1.tid,t1.t1;

then check the execution plan for select count(1) from v_1, the t2 is eliminated by the optimizer:

SQL> select count(1) from v_1;


Execution Plan
----------------------------------------------------------
Plan hash value: 3243658773

----------------------------------------------------------------------------------
| Id  | Operation            | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |           |     1 |       |     3  (34)| 00:00:01 |
|   1 |  SORT AGGREGATE      |           |     1 |       |            |          |
|   2 |   VIEW               | VM_NWVW_0 |     1 |       |     3  (34)| 00:00:01 |
|   3 |    HASH GROUP BY     |           |     1 |    26 |     3  (34)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| T1        |     1 |    26 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------

but if the index i_t2 is dropped or recreated without unique attribute,

the table t2 is not eliminated in execution plan:

SQL> drop index i_t2;

Index dropped.

SQL> select count(1) from v_1;


Execution Plan
----------------------------------------------------------
Plan hash value: 2710188186

-----------------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |     1 |       |     5  (20)| 00:00:01 |
|   1 |  SORT AGGREGATE       |           |     1 |       |            |          |
|   2 |   VIEW                | VM_NWVW_0 |     1 |       |     5  (20)| 00:00:01 |
|   3 |    HASH GROUP BY      |           |     1 |    39 |     5  (20)| 00:00:01 |
|*  4 |     HASH JOIN OUTER   |           |     1 |    39 |     4   (0)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| T1        |     1 |    26 |     2   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL| T2        |     1 |    13 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

it seems even if the index is removed,

the result of select count(1) from v_1 also equal to select count(1) from (select tid,t1 from t1 group by tid,t1)

why the optimizer does not eliminate t2 in the second case?

is there any principle or actual data example discribing this? thanks :)

yaoweijq
  • 263
  • 2
  • 12

1 Answers1

2

This is an optimization called join elimination. Because t2.t2 us unique, the optimizer knows that every row retrieved from t1 can only ever retrieve one row from t2. Since there is nothing projected from t2, there is no need to perform the join. If you do

select tid, t1 from v_1;

you will see that we do not perform the join. However, if we project from t2, then the join is needed.

BobC
  • 4,208
  • 1
  • 12
  • 15
  • Thanks. I am clear about the first case. And i am not clear with the second case that why t2 is not eliminated. – yaoweijq Feb 20 '20 at 03:05
  • @BobC "recreated without unique attribute" That's why. You said "Because t2.t2 us unique" but then the OP canceled that assumption. – Jeff Holt Feb 20 '20 at 03:10
  • In the 2nd case, the optimizer does not know how many rows will be produced by the join. Therefore the join needs to be completed. – BobC Feb 20 '20 at 03:13
  • @BobC the view has group by clause – yaoweijq Feb 20 '20 at 03:16
  • @yaoweijq Since your outer query is getting only a count of grouped rows, there is still no need to join to t2 even if the t2 index is non-unique. If I'm wrong, I'm sure Bob will say why it's necessary. But for now, I think you've simply uncovered an optimization that Oracle has yet to institute. I guess it could be a bug but I wouldn't count on anyone admitting it. You may want to look at [this](https://stackoverflow.com/q/40490220/1707353) as your usage is a bit like a subquery. – Jeff Holt Feb 20 '20 at 03:39
  • @JeffHolt Thanks, i will carefuly view your link.:) – yaoweijq Feb 20 '20 at 03:47
  • @yaoweijq, I think you are right that we could eliminate the join in this case. Is this a real world use case or just a science experiment? – BobC Feb 20 '20 at 16:23
  • @BobC, the case is from one real product case. – yaoweijq Feb 21 '20 at 01:41
  • @BobC "we could eliminate the join"..... It seems you are from oracle? :) – yaoweijq Feb 21 '20 at 02:25
  • Yes. What is the impact in the real world? Is the unique index a viable option? Another option (which I have yet to test) would be a unique constraint, defined as RELY, DISABLE, NOVALIDATE. – BobC Feb 21 '20 at 02:28
  • @BobC The left join && group by is in a view as the example.And the product issue sql as select count(1) from view_name. If t2 is not eliminated by optimizer, there maybe performance impact. And i am not very concern about this as the product may change sql . I care about what principle forbide optimizer do the elimnation. Is it very hard or just a bug or just an ignored little mistake? – yaoweijq Feb 21 '20 at 02:58
  • It's not a bug. I would look at it as an optimization that has not been implemented, as opposed to something that forbids the optimizer. It does looks like a fairly special corner case. – BobC Feb 21 '20 at 03:05