3

I am quite confused by something I'm seeing in an Oracle 10 database.

I have the following query.

select
t2.duplicate_num
from table1 t1, table2 t2,

( 
   select joincriteria_0 from intable1 it1, intable2 it2 
   where it2.identifier in (4496486,5911382) 
   and it1.joincriteria_0 = it2.joincriteria_0 
   and it1.filter_0 = 1 
) tt

where t1.joincriteria_0 = tt.joincriteria_0
and t2.joincriteria_1 = t1.joincriteria_1
and t2.filter_0 = 3
and t2.filter_1 = 1
and t2.filter_2 not in (48020)

It doesn't really seem like anything special to me, here are the baseline performance numbers from autotrace:

CR_GETS: 318

CPU: 3

ROWS: 33173

Now if I add the 'DISTINCT' keyword to the query (e.g. 'select distinct t2.duplicate_num...') this happens

CR_GETS: 152921

CPU: 205

ROWS: 305

The query plan has not changed, but the logical IO grows by a factor of 500. I was expecting CPU only to go up and logical IO to be largely unchanged.

The net result is a query that runs 10-100x slower with the distinct keyword. I can put code into the applciation which would make the result set distinct in a fraction of the time. How does this make any sense? particularly without the query plan changing?

Evan Cross
  • 31
  • 1
  • 3
  • 1
    Good read- http://oracle-randolf.blogspot.com/2011/01/hash-aggregation.html – rs. Oct 01 '12 at 21:13
  • Interesting link rs - I have read it and it looks like it certainly could be applicable to my query. I am running 10.2.0.5 which seems to sometimes be affected for HASH (UNIQUE) calls. I tried adding some of the hints and they had no effect, but I'm certainly not ruling this out as a possibility. – Evan Cross Oct 01 '12 at 22:25

1 Answers1

0

This indicates a lack of index somewhere. It also means, your original query without the distinct clause wasn't optimized. With "distinct" also it could not be optimized, so the query plan remained the same. An unoptimized query varies widely in performance due to the full table scans.

srini.venigalla
  • 5,137
  • 1
  • 18
  • 29
  • If this is a production scenario, I would create the inner subquery as a view, and start with it and join the other two tables. – srini.venigalla Oct 01 '12 at 21:20
  • If it was an index problem, wouldn't I see the object access methods change in the query plan? The query plans are identical between the normal and 'distinct' keyword version, the ONLY difference being a top-level HASH(UNIQUE). But, every step in the autotrace is 10-500x more expensive in logical IO in the latter case. To address your question though I have all index range scan and index unique scan access methods. No full access. – Evan Cross Oct 01 '12 at 21:33