I am quite confused by something I'm seeing in an Oracle 10 database.
I have the following query.
select
t2.duplicate_num
from table1 t1, table2 t2,
(
select joincriteria_0 from intable1 it1, intable2 it2
where it2.identifier in (4496486,5911382)
and it1.joincriteria_0 = it2.joincriteria_0
and it1.filter_0 = 1
) tt
where t1.joincriteria_0 = tt.joincriteria_0
and t2.joincriteria_1 = t1.joincriteria_1
and t2.filter_0 = 3
and t2.filter_1 = 1
and t2.filter_2 not in (48020)
It doesn't really seem like anything special to me, here are the baseline performance numbers from autotrace:
CR_GETS: 318
CPU: 3
ROWS: 33173
Now if I add the 'DISTINCT' keyword to the query (e.g. 'select distinct t2.duplicate_num...') this happens
CR_GETS: 152921
CPU: 205
ROWS: 305
The query plan has not changed, but the logical IO grows by a factor of 500. I was expecting CPU only to go up and logical IO to be largely unchanged.
The net result is a query that runs 10-100x slower with the distinct keyword. I can put code into the applciation which would make the result set distinct in a fraction of the time. How does this make any sense? particularly without the query plan changing?