0

I have a query which take data from 5 huge table, could you please help me with performance tuning of this query :

SELECT DECODE(SIGN((t1.amount - NVL(t2.amount, 0)) - 4.999), 1, NVL(t2.amount, 0), t1.amount) AS amount_1,
       t1.element_id,
       t1.start_date ,
       t1.amount,
       NVL(t5.abrev, NULL) AS criteria,
       t1.case_id ,
       NVL(t5.value, NULL) segment,
       add_months(t1.start_date, -1) invoice_date,
       NVL((SELECT SUM(b.amount)
             FROM TABLE1 a, TABLE3 b
            WHERE a.element_id = b.element_id
              AND b.date_invoicing < a.start_date
              AND t1.element_id = a.element_id),
           0) amount_2
  FROM TABLE1 t1, TABLE2 t2, TABLE3 t3, TABLE4 t4, TABLE5 t5
 WHERE t1.TYPE = 'INVOICE'
   AND t2.case_id = t3.case_id
   AND t2.invoicing_id = t3.invoicing_id
   AND t2.date_unpaid IS NULL
   AND t1.element_id = t3.element_id(+)
   AND add_months(t1.start_date, -1) <
       NVL(t4.DT_FIN_DT(+), SYSDATE)
   AND add_months(t1.start_date, -1) >= t4.date_creation(+)
   AND t1.case_id = t4.case_id(+)
   AND t4.segment = t5.abrev(+)
   AND t5.Type(+) = 'CRITERIA_TYPE';

is there something wrong and could be replaced with something else?
Thanks for your help

mcha
  • 2,938
  • 4
  • 25
  • 34
  • 2
    As it stands you have an attempted outer join to t3 using (+) that get forced back to an inner join by other predicates that don't use (+) such as "t2.case_id = t3.case_id". If you could write this using ANSI joins it would be less error-prone and easier to read. That hasnothing to do with performance though. – Tony Andrews Dec 02 '10 at 14:16

5 Answers5

2

The first thing you must do is to use Explicit Joins. This will separate your joins from your filters and will help you tune this better.

Please check if these joins are correct.

SELECT 
    DECODE(SIGN((t1.amount - NVL(t2.amount, 0)) - 4.999), 1, NVL(t2.amount, 0), t1.amount) AS amount_1,
    t1.element_id,
    t1.start_date ,
    t1.amount,
    NVL(t5.abrev, NULL) AS criteria,
    t1.case_id ,
    NVL(t5.value, NULL) segment,
    add_months(t1.start_date, -1) invoice_date,
    NVL
    (
        (SELECT SUM(b.amount)
        FROM TABLE1 a, TABLE3 b
        WHERE a.element_id = b.element_id
        AND b.date_invoicing < a.start_date
        AND t1.element_id = a.element_id),
    0) amount_2
FROM 
    TABLE1 t1

    LEFT OUTER JOIN TABLE3 t3
        on t1.element_id = t3.element_id

    INNER JOIN TABLE2 t2, 
        on t2.invoicing_id = t3.invoicing_id
        and t2.case_id = t3.case_id

    LEFT OUTER JOIN TABLE4 t4 
        on t1.case_id = t4.case_id

    LEFT OUTER JOIN TABLE5 t5
        on t4.segment = t5.abrev

WHERE t1.TYPE = 'INVOICE'
    AND t2.date_unpaid IS NULL
    AND add_months(t1.start_date, -1) < NVL(t4.DT_FIN_DT(+), SYSDATE)
    AND add_months(t1.start_date, -1) >= t4.date_creation(+)
    AND t5.Type(+) = 'CRITERIA_TYPE';

If they are, then you can do several things, but the best thing is to look at the execution plan.

Raj More
  • 47,048
  • 33
  • 131
  • 198
  • 2
    In terms of performance, they should be exactly the same. Separation helps you identify things to tune up much better. – Raj More Dec 02 '10 at 14:22
  • +1 for suggesting ANSI-style joins, but those should really be `LEFT OUTER JOIN` instead of just `OUTER JOIN`. – BQ. Dec 02 '10 at 18:55
  • @BQ I corrected that. It used to be one of my favourite questions to ask in an interview - difference between LEFT JOIN and LEFT OUTER JOIN. It definitely threw a lot of people off, and in some cases we got some really creative answers. :) – Raj More Dec 02 '10 at 21:31
1

The optimizer may have produced a suboptimal execution plan. Or it may very well be running as fast as possible given the amount of work the database actually need to do. Without explain plan, knowing the keys, relations and indexes it is a bit hard to tell what is going on.

Scalar subqueries in the select list is usually not a good idea when the outer query returns a large nr of rows.

The following expressions may prevent the optimizer from using the statistics because of the function calls. Indexes would probably not be used either for the same reason.

AND add_months(t1.start_date, -1) < NVL(t4.DT_FIN_DT(+), SYSDATE)
AND add_months(t1.start_date, -1) >= t4.date_creation(+)

Can't really be more specific than that :)

Ronnis
  • 12,593
  • 2
  • 32
  • 52
1

You need to learn about how to view and understand execution plans. This previous question is a good place to start.

Community
  • 1
  • 1
Dave Costa
  • 47,262
  • 8
  • 56
  • 72
1

As others have noted, it's hard to tell without looking at the execution plan.

But... some things I'd be concerned with:

  1. The outer join to TABLE3 in the main query isn't complete as @TonyAndrews mentioned in his comment above. See the "Incomplete Join Trail" example on Common errors seen when using OUTER-JOIN. This means your query is probably producing the wrong results, but without knowing the full intent of the query and the schema, no one but you could know this for sure.

    Updating your query to use the ANSI-style INNER/[LEFT|RIGHT] OUTER syntax from the Oracle-style TableName.ColumnName(+) will help make this more apparent.

  2. The scalar subquery will get run for every row and may be slow (assuming TABLE3 is large). It will be extremely slow if there's not a useful index on TABLE3.element_id and TABLE3.date_invoicing:

    NVL((SELECT SUM(b.amount)
         FROM TABLE1 a, TABLE3 b
         WHERE a.element_id = b.element_id
           AND b.date_invoicing < a.start_date
           AND t1.element_id = a.element_id),
        0) amount_2
    

    As such, I'm not seeing a need to include TABLE1 again in this subquery. It may be better to refactor this into:

    NVL((SELECT SUM(b.amount)
         FROM TABLE3 b
         WHERE t1.element_id = b.element_id
           AND b.date_invoicing < t1.start_date,
        0) amount_2
    

    Or, you may even be better off refactoring this to use an analytical function (SO question, Oracle documentation) if the criteria for summing the b.amount values is the same as that for including them in the query in the first place:

    SUM(b.amount) OVER (PARTITION BY b.element_id) amount_2
    

    Obviously, you currently have different criteria for summing b.amount since you're joining to TABLE3 differently in the main query and the subquery, but I'd imagine that's more a factor of the "Incomplete Join Trail" than by purposeful design (a guess on my part, as I can't tell the intent of the query from the code itself).

Community
  • 1
  • 1
BQ.
  • 9,393
  • 3
  • 25
  • 35
0

It's abit weird here when you nest Select statement inside another

NVL((SELECT SUM(b.amount)
             FROM TABLE1 a, TABLE3 b
            WHERE a.element_id = b.element_id
              AND b.date_invoicing < a.start_date
              AND t1.element_id = a.element_id),
           0) amount_2

you need to write again as a table and join after "From".