3

Background

Recently my company upgraded from Oracle 11g to Exadata. All production objects and data were successfully migrated to the new system and were verified to be a 1:1 match between systems. Immediately after the first daily set of ETL processes were completed on our new system we discovered our reporting tables were significantly smaller than expected. Upon further investigation we discovered that a batch ID being added to LEFT OUTER joins were causing the problem where they had worked perfectly on 11g.

Question

Why would the following query be treated as a LEFT OUTER JOIN on 11g, but an INNER JOIN on Exadata?

SELECT 
    *
FROM DIM_CALL CALLS
LEFT OUTER JOIN FACT_ACTVY_MNGMT ACTVY_MNGMT
    ON ACTVY_MNGMT.CALL_KEY = CALLS.CALL_KEY
    AND ACTVY_MNGMT.BATCH_ID = 20141112
LEFT OUTER JOIN DIM_ACTVY ACTVY
    ON ACTVY.ACTVY_KEY = ACTVY_MNGMT.ACTVY_KEY
    AND ACTVY_MNGMT.BATCH_ID = 20141112

Update

It appears there was either a typo in the query used in the ETL process or the original developer just overlooked this case. If you look at the last join you'll notice that the join is on the ACTVY_KEY and BATCH_ID. The issue is that the BATCH_ID it's referencing is from the ACTVY_MNGMT table. The database essentially treats this as a WHERE clause, so any case in which CALL_KEY is NULL results in a failure.

  • I don't know why it would treat it like an inner join, but a possible solution would be to allow nulls in the where clause: SELECT * FROM DIM_CALL CALLS LEFT OUTER JOIN FACT_ACTVY ACTVY ON ACTVY.CALL_KEY = CALLS.CALL_KEY WHERE ACTVY.{PRIMARY_KEY} IS NULL OR ACTVY.BATCH_ID = 20141112 – CactusCake Nov 12 '14 at 18:33
  • 1
    Are you sure that is what is actually being sent to the database? Have you profiled it? Could it be translating it as Where ACTVY.BATCH_ID = 20141112 which would turn it into an inner join. – HLGEM Nov 12 '14 at 18:45
  • Thanks HLGEM! You pointed me in the right direction and I figured out what the issue was. The example that I gave isn't detailed enough to cause the issue that I encountered, so I'm going to update the post to reflect the true case. – Chad Stewart Nov 12 '14 at 19:52

2 Answers2

1

It appears there was either a typo in the query used in the ETL process or the original developer just overlooked this case. If you look at the last join you'll notice that the join is on the ACTVY_KEY and BATCH_ID. The issue is that the BATCH_ID it's referencing is from the ACTVY_MNGMT table. The database essentially treats this as a WHERE clause, so any case in which CALL_KEY is NULL results in a failure.

1

There is a minor programming mistake here, but it also looks like there is an optimizer bug. Think of a LEFT JOIN as operating on the previous set, not just the previous table. Adding a duplicate predicate to a different LEFT JOIN should not make any difference.

My guess is there is an optimizer or parser bug with some fancy Exadata feature, such as smart scan.

This code is similar to your problem. However, it does not reproduce the error for me, on 11.2.0.3 EE, on Solaris. And it may not even reproduce the error for you, on Exadata. It's just meant to show that the problem you've described is a bug. Avoiding the bug, like you're doing, is usually the best approach. But you may also want to create a service request with Oracle support to look into the issue. That same bug may be affecting other code in less obvious ways.

with table1 as (select '1' a, '1' b from dual),
     table2 as (select '1' a, '2' b from dual),
     table3 as (select '1' a, '2' b from dual)
select *
from table1
left join table2
    on table1.a = table2.a
    and table2.b = 3
left join table3
    on table2.a = table3.a;

with table1 as (select '1' a, '1' b from dual),
     table2 as (select '1' a, '2' b from dual),
     table3 as (select '1' a, '2' b from dual)
select *
from table1
left join table2
    on table1.a = table2.a
    and table2.b = 3
left join table3
    on table2.a = table3.a
    --This predicate is logically redundant but does *not* change results.
    and table2.b = 3;

Both queries return this result on Oracle 11.2.0.3, EE, on Solaris:

A   B   A   B   A   B
-   -   -   -   -   -
1   1                
Jon Heller
  • 34,999
  • 6
  • 74
  • 132
  • Already submitted a ticket to our DBAs about the bug and they're going to get in touch with Oracle. Thanks for the response! – Chad Stewart Nov 12 '14 at 21:45
  • You're welcome. If you do create a bug or find an existing bug please add the relevant information here so it can help someone else. – Jon Heller Nov 12 '14 at 21:49