1

Assume two tables:

TRANSACTION
Primary Key: REF_NO
Columns:     REF_NO, TXN_DATE, ITEM_CODE, QUANTITY

ITEM
Primary Key: ITEM_CODE
Columns:     ITEM_CODE, ITEM_DESC

Query (1):

SELECT T.REF_NO, T.TXN_DATE, T.ITEM_CODE,
    I.ITEM_DESC,
    T.QUANTITY
FROM TRANSACTION T, ITEM I
WHERE T.ITEM_CODE = I.ITEM_CODE

Query (2):

SELECT T.REF_NO, T.TXN_DATE, T.ITEM_CODE,
    (SELECT ITEM_DESC FROM ITEM WHERE ITEM_CODE = T.ITEM_CODE) AS ITEM_DESC,
    T.QUANTITY
FROM TRANSACTION T

Indices (indexes) are on both tables as necessary.

The above is a very simplified version of the stuff I'm doing, but the concept is the same.

I was told that (1) is more efficient due to indices, and Explain Plan actually suggests that it is. Explain plan for (1) shows index access on both tables. Explain plan for (2) shows index access on ITEM, but full table access on TRANSACTION.

But my dilemma is when I run them on a very large set of data to time the actual performance, (2) is four times faster than (1)! What are the possible reasons for this? Why should I choose (1) over (2)? (We decided to choose (2) over (1).)

ADTC
  • 8,999
  • 5
  • 68
  • 93
  • I missed out on the cardinality but cost was a lot higher on (2), which is very puzzling. _(Not at the system to test again.)_ – ADTC May 23 '13 at 09:10

1 Answers1

1

You most likely profited from Scalar Subquery Caching. I've recently blogged about this awesome feature addition to Oracle 11g (or 10g?):

http://blog.jooq.org/2011/09/02/oracle-scalar-subquery-caching

Check out your execution plan, you'll find some curious elements at the top of the plan, indicating that the actual subquery is not really evaluated for every row originating from the TRANSACTIONS table. This is because Oracle's CBO introspects constraint metadata to deduce that for every row in TRANSACTIONS there can only be one matching row in ITEMS, if equi-joined using ITEM_CODE. This knowledge makes your subquery subject to caching. If you have many equal values for TRANSACTIONS.ITEM_CODE, caching can have a very positive effect.

More useful information can be found here:

Community
  • 1
  • 1
Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
  • Problem is, the DB I was testing on was Oracle 10g. Also, does subquery caching happen even if you have `WHERE ITEM_CODE = T.ITEM_CODE`? Asking since the data in each row for this column in T can be different. – ADTC May 23 '13 at 09:16
  • Hmm, I'm actually not sure anymore, when it was introduced. Can't seem to find any authoritative source. Anyway, yes it works precisely because you have `ITEM_CODE = T.ITEM_CODE`. Constraint meta data plays a role for Oracle to decide that this particular subquery is cacheable. I.e. Oracle knows, that for every row in `TRANSACTION`, there can only be a single row in `ITEM` if joined using `ITEM_CODE`. So if you have many similar values in `T.ITEM_CODE`, caching can have positive effects. – Lukas Eder May 23 '13 at 09:24
  • I think it might have been introduced in 10g. And reading through your oracle.com link, I understand it explains the higher speed of (2), as you explain in your comment. _(Consider moving your explanation in the comment to your answer's body.)_ – ADTC May 23 '13 at 09:29