0

I have been trying to optimize a query by adding several indices. I was successful in most of them but there is this one query where the column I have created was never used. Query looks something like this:

SELECT DISTINCT
  c.containername
  , hml.qty
  , CAST(hml.txndate AS TIMESTAMP) txndate
  , hml.comments
  , e.fullname
  , 'ICO' txn
FROM con c
JOIN hml hml ON c.containerid = hml.historyid
JOIN emp e ON hml.employeeid = e.employeeid
WHERE hml.comments LIKE '%ICO%'
      AND hml.txndate >= to_date(:ddate, 'MM/DD/YYYY HH:MI:SS PM')
      AND e.notes IN ('DBP', 'FTD', 'MH', 'SUPERVISOR')
ORDER BY CAST(hml.txndate AS TIMESTAMP);

Execution Plan

    Plan hash value: 267728100

---------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                      |   116 | 14616 |   248K  (1)| 00:49:40 |
|   1 |  SORT ORDER BY                   |                      |   116 | 14616 |   248K  (1)| 00:49:40 |
|   2 |   HASH UNIQUE                    |                      |   116 | 14616 |   248K  (1)| 00:49:40 |
|   3 |    NESTED LOOPS                  |                      |       |       |            |          |
|   4 |     NESTED LOOPS                 |                      |   116 | 14616 |   248K  (1)| 00:49:40 |
|*  5 |      HASH JOIN                   |                      |   116 | 11252 |   248K  (1)| 00:49:37 |
|*  6 |       TABLE ACCESS FULL          | EMPLOYEE             |     8 |   304 |    25   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS BY INDEX ROWID| HISTORYMAINLINE      | 38373 |  2210K|   248K  (1)| 00:49:37 |
|*  8 |        INDEX RANGE SCAN          | HISTORYMAINLINEMX005 |   347K|       |  1248   (1)| 00:00:15 |
|*  9 |      INDEX UNIQUE SCAN           | CONTAINER450         |     1 |       |     1   (0)| 00:00:01 |
|  10 |     TABLE ACCESS BY INDEX ROWID  | CONTAINER            |     1 |    29 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("HML"."EMPLOYEEID"="E"."EMPLOYEEID")
   6 - filter("E"."NOTES"='DBP' OR "E"."NOTES"='FTD' OR "E"."NOTES"='MH' OR 
              "E"."NOTES"='SUPERVISOR')
   7 - filter("HML"."COMMENTS" IS NOT NULL AND "HML"."COMMENTS" LIKE '%ICO%')
   8 - access("HML"."TXNDATE">=TO_DATE(:DDATE,'MM/DD/YYYY HH:MI:SS AM'))
   9 - access("C"."CONTAINERID"="HML"."HISTORYID")

Cost is 248,283. Same with the cost before i put on the index. Additionally, when i change the hml.comments LIKE '%ICO%' to hml.comments = '%ICO%', cost improves really well (only 14!) because the index I have created is used. Unfortunately, I still have to used LIKE since I don't know the position where the tag ICO is used on the comments column.

Is there a way I can have the query be improved? I am using Oracle 11g and the hml table has a total rows of 40,193,106 by the way. Thanks in advance to all who will take time to answer.

CaLoL
  • 1
  • 5
  • How many rows out of 40M have ICO in comments field? – Kirill Leontev May 04 '15 at 02:31
  • Hi @beherenow I'm afraid I wouldn't be able to provide the accurate count. THis particular query will run twice a day to export data that contains ICO on the comments field. From what I have exported from July 2014 to date it was just 5369. – CaLoL May 04 '15 at 02:42
  • 1
    you don't need to know the accurate count but you need to know the estimate. if your `comments like '%ICO%'` condition filters out 1% out of 40M rows - ok, but if it filters out 20-50% out of 40M rows - the index won't speed up your query, it will actually slow it down. – Kirill Leontev May 04 '15 at 02:57
  • So what can you suggest to improve the query then, @beherenow ? – CaLoL May 04 '15 at 02:59
  • what exactly do you want to improve? cost of 121k on its own tells you nothing - it may or may not be an issue - you need to understand the context. does it run for an hour and you want it to run for a minute? ok, but in this case you need to provide some additional info - an execution plan with execution statistics - to figure out the bottleneck. absence of index on comments could be an issue - but it easily could be something else. – Kirill Leontev May 04 '15 at 03:21
  • You are correct @beherenow . This query runs slow like 3 to 5 minutes and I want it to speed up. I will be posting the execution plan then. Sorry if I have not put it originally. – CaLoL May 04 '15 at 03:30
  • 1
    http://oracle-base.com/articles/9i/dbms_xplan.php#gather_plan_statistics_hint take a look at this description how to get an actual execution plan with execution statistics. – Kirill Leontev May 04 '15 at 03:32
  • 1
    Is your index on the comments field a full text index (INDEXTYPE IS CTXSYS.CTXCAT)? http://stackoverflow.com/questions/7749407/how-to-create-text-index-for-abc-search A normal index won't help with a LIKE '%...%' – Dijkgraaf May 04 '15 at 03:52
  • you provided just a basic execution plan, which is nothing but a list of steps to be taken. it doesn't show amounts of data processed during each step. take a look at the link I provided – Kirill Leontev May 04 '15 at 04:12
  • Oracle cant' use an index for `LIKE '%ICO%'`. It can however use an index for the condition `hml.comments = '%ICO%'` but that will return something ***completely*** different. It searches for all comments that are (exactly) 5 characters long, start with a `%` sign, then contain `ICO` and end with a `%`. There is no way to speed up the "search everything" query that the ``LIKE '%ICO%'` does. The only way to speed this up is to use a Oracle full text capabilities –  May 04 '15 at 06:00
  • 1
    Assuming the use-case is only for 'ICO' being in there, you could add a calculated field that flags the presence of the string and then put an index on that calculated field, but I have a gut-feeling that something true/false on 40M rows won't be considered useful too quickly by the system. That said, you could add it to the fields of another index that already gets used. That is, the index(es?) on the other fields used in `hml`. (e.g. employeeid & historyid). This *might* make that part (much?) more restricting than it now is. As always, it all depends on the data and the logic behind it... – deroby May 05 '15 at 20:56
  • Just to give an update to everyone, I have asked the help of our DBA to check what is the best solution to the query. As it turns out, the query is running fine given that the parameter on the :ddate binding variable is a date which is relatively small (Say last 3 days). Since this query will run at least twice a day, the interval will be small from the last run time. We have concluded that this can be deployed now. thanks everyone for the support. – CaLoL May 08 '15 at 09:47

0 Answers0