1

I have this query on the order lines table. It's a fairly large table. I am trying to get quantity shipped by item in the last 365 days. The query works, but is very slow to return results. Should I use a function-based index for this? I read a bit about them, but haven't worked with them much at all.

How can I make this query faster?

select OOL.INVENTORY_ITEM_ID,
       SUM(nvl(OOL.shipped_QUANTITY, 0)) shipped_QUANTITY_Last_365
from oe_order_lines_all OOL
where ool.actual_shipment_date >= trunc(sysdate) - 365
  and cancelled_flag = 'N'
  and fulfilled_flag = 'Y'
group by ool.inventory_item_id;

Explain plan:

(explain plan screenshot)

Stats are up to date, we regather once a week.

Query taking 30+ minutes to finish.

UPDATE

After adding this index:

(index definition screenshot)

The explain plan shows the query is using the index now: (explain plan screenshot)

The query runs faster, but not "fast": it now completes in about 6 minutes.

UPDATE2

I created a covering index as suggested by Matthew and Gordon: (index definition screenshot)

The query now completes in less than 1 second.

Explain Plan: (explain plan screenshot)

I still wonder whether a function-based index would also have been a viable solution, but I don't have time to play with it right now.
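From what I read, a function-based index would only come into play if the predicate applied a function to the column. My query compares actual_shipment_date directly, so a plain index on the column is what the optimizer can use. A hypothetical example of what I was considering (the index name is made up):

```sql
-- Hypothetical: this would only be used if the WHERE clause applied the
-- same expression, e.g. WHERE trunc(actual_shipment_date) >= trunc(sysdate) - 365
CREATE INDEX oe_lines_ship_date_fbi
    ON oe_order_lines_all (TRUNC(actual_shipment_date));
```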

alexherm

3 Answers

4

As a rule, using an index to access a "significant" percentage of the rows in your table is slower than a full table scan. Depending on your system, "significant" could be as low as 5% or 10%.

So, think about your data for a minute...

  • How many rows in OE_ORDER_LINES_ALL are cancelled? (Hopefully not many...)
  • How many rows are fulfilled? (Hopefully almost all of them...)
  • How many rows were shipped in the last year? (Unless you have more than 10 years of history in your table, more than 10% of them...)

Put that all together and your query is probably going to have to read at least 10% of the rows in your table. This is very near the threshold where an index is going to be worse than a full table scan (or, at least not much better than one).

Now, if you need to run this query a lot, you have a few options.

  1. Materialized view, possibly for the prior 11 months together with a live query against OE_ORDER_LINES_ALL for the current month-to-date.
  2. A covering index (see below).
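A rough sketch of the materialized view option (the view name, refresh policy, and monthly bucketing are my assumptions, not a tested design): pre-aggregate by item and month, then sum the closed monthly buckets and union in a live month-to-date query against the base table.

```sql
-- Sketch only: pre-aggregate shipped quantity by item and month.
-- Avoids SYSDATE inside the MV; the reporting query sums the last 11-12
-- monthly buckets and unions in a live query for the current month.
CREATE MATERIALIZED VIEW mv_shipped_by_item_month
REFRESH COMPLETE ON DEMAND
AS
SELECT inventory_item_id,
       TRUNC(actual_shipment_date, 'MM') AS ship_month,
       SUM(NVL(shipped_quantity, 0))     AS shipped_quantity
FROM   oe_order_lines_all
WHERE  cancelled_flag = 'N'
  AND  fulfilled_flag = 'Y'
GROUP  BY inventory_item_id, TRUNC(actual_shipment_date, 'MM');
```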

You can improve the performance of an index, even one accessing a significant percentage of the table rows, by making it include all the information required by the query -- allowing Oracle to avoid accessing the table at all.

CREATE INDEX idx1 ON OE_ORDER_LINES_ALL
  ( actual_shipment_date,
    cancelled_flag,
    fulfilled_flag,
    inventory_item_id,
    shipped_quantity ) ONLINE;

With an index like that, Oracle can satisfy the query by just reading the index (which is faster because it's much smaller than the table).

Matthew McPeak
1

For this query:

select OOL.INVENTORY_ITEM_ID,
       SUM(OOL.shipped_QUANTITY) as shipped_QUANTITY_Last_365
from oe_order_lines_all OOL
where ool.actual_shipment_date >= trunc(sysdate) - 365 and
      cancelled_flag = 'N' and
      fulfilled_flag = 'Y'
group by ool.inventory_item_id;

I would recommend starting with an index on oe_order_lines_all(cancelled_flag, fulfilled_flag, actual_shipment_date). That should do a good job in identifying the rows.

You can add the additional columns inventory_item_id and shipped_quantity to the index as well, so the query can be answered from the index alone.
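Spelled out as DDL, that would look something like this (the index names are made up):

```sql
-- Selective index matching the WHERE clause:
CREATE INDEX oe_lines_ship_ix
    ON oe_order_lines_all (cancelled_flag, fulfilled_flag, actual_shipment_date);

-- Or extended with the selected columns so the table is never visited:
CREATE INDEX oe_lines_ship_cov_ix
    ON oe_order_lines_all (cancelled_flag, fulfilled_flag, actual_shipment_date,
                           inventory_item_id, shipped_quantity);
```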

Gordon Linoff
0

Let's recapitulate the facts:

a) You access about 300K rows of your table (see the cardinality in the 3rd line of the execution plan)

b) You use a FULL TABLE SCAN to get the data

c) The query is very slow

The first thing is to check why the FULL TABLE SCAN is so slow - if the table is extremely large (check BYTES in user_segments), you need to optimize the access to your data.

But remember, no index will help you get 300K rows from, say, 30M total rows.

Index access to 300K rows can take a quarter of an hour or even more if the index is not much used and a large part of it is on disk.

What you need is partitioning - in your case, range partitioning on actual_shipment_date - on a monthly or yearly basis for your data size.

This will eliminate the need to scan the old data (partition pruning) and make the query much more efficient.
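As a sketch only (a seeded EBS table cannot simply be rebuilt, so the table name and the abbreviated column list here are purely illustrative; interval partitioning creates new monthly partitions automatically):

```sql
-- Illustrative only: monthly interval partitioning on the shipment date.
CREATE TABLE order_lines_demo (
    inventory_item_id    NUMBER,
    actual_shipment_date DATE,
    shipped_quantity     NUMBER,
    cancelled_flag       VARCHAR2(1),
    fulfilled_flag       VARCHAR2(1)
)
PARTITION BY RANGE (actual_shipment_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(PARTITION p_hist VALUES LESS THAN (DATE '2015-01-01'));
```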

Another possibility - if the number of rows is small but the table size is very large - is to reorganize the table to get a better full scan time.

Marmite Bomber
  • This is a standard table for Oracle EBS, so reorganizing the table is not an option. I'm not sure about the range partitioning either, I have a feeling something like that may break when a patch is installed. – alexherm Dec 19 '19 at 20:43
  • Table has ~10 million records. With the covering index the query is done in less than 1 second. – alexherm Dec 19 '19 at 21:55
  • @alexherm Be a little careful about the optimistic timing (< 1 second). A side effect of reading from a covering index is that the index blocks you read go into the middle of the LRU cache (whereas blocks read via FTS are added at the least-recently-used end, where they age out immediately). That means you may be benefiting from cache hits that might be unrealistic in Production if this query is run only infrequently. – Matthew McPeak Dec 19 '19 at 22:24
  • I checked that by opening a new session and running the query. It too completed in less than 1 second. I suppose a better test will be after next gather stats happens. – alexherm Dec 19 '19 at 22:28
  • I moved the index into production and ran the query there, where it had not been run at all in a few days. Also done in less than 1 second. – alexherm Dec 19 '19 at 22:41
  • @alexherm How many rows return this query: `select count(*) from oe_order_lines_all OOL where ool.actual_shipment_date>=trunc(sysdate)-365 and cancelled_flag='N' and fulfilled_flag='Y'` – Marmite Bomber Dec 20 '19 at 05:31
  • @Marmie - 352,989 rows – alexherm Dec 20 '19 at 15:22