0

My Oracle query takes over 1.5 min and I do not know if it's because of inefficient query writing, bad choice of indexes or some other database issue that I cannot control.

Some tables and data were changed to protect IP.

SELECT /*+ PARALLEL (AUTO) */ COUNT(DISTINCT SUD_USERID)
FROM (
    SELECT /*+ PARALLEL (AUTO) */
           SUD_USERID ,
           CASE WHEN SCH_PAGETYPE = 'Page' AND SUD_EVENTTYPE = 'S'
               THEN 'EVENTTYPE1'
                WHEN SCH_PAGETYPE = 'Page' AND SUD_EVENTTYPE = 'V'
             THEN 'EVENTTYPE2'
             WHEN SCH_PAGETYPE = 'Hub' AND SUD_EVENTTYPE = 'S'
             THEN 'EVENTTYPE3'
        END AS CALC_EVENT_SOURCE,
        SUD_EVENT_SOURCE
        FROM
       (
            SELECT /*+ PARALLEL (AUTO) */
                     UPPER(PAGETYPE)|| '-' || SCH.ID PAGETYPE_ID ,
                SCH.PAGETYPE SCH_PAGETYPE
            FROM TABLE1 SCH
            WHERE   SCH.PAGETYPE IN ('Page', 'Hub')
                AND SCH.CATEGORY_NAME NOT IN ('archive', 'testcategory')
        )
        INNER JOIN (
            SELECT /*+ PARALLEL (AUTO) */
                DISTINCT SUD.TRACEID TRACEID ,
                SUD.EVENTTYPE SUD_EVENTTYPE ,
                SUD.USERID SUD_USERID,
                SUD.EVENT_SOURCE SUD_EVENT_SOURCE
            FROM
                SOMESCHEMA.USAGE_DETAILS SUD
            WHERE
                SUD.EVENTTYPE IN ('S', 'V')
        )
       ON TRACEID = PAGETYPE_ID
        INNER JOIN USER_JOB_FAMILY_MAPPING SFD
       ON SUD_USERID = SFD.USERID
       )
    WHERE CALC_EVENT_SOURCE = SUD_EVENT_SOURCE

I could not copy the text of the explain plan (generated via DBeaver) but here is a screenshot:

enter image description here

USAGE_DETAILS table has 3941810 rows
TABLE1 has 5908 rows
USER_JOB_FAMILY_MAPPING has 578233 rows

There are no keys on any of these tables. USAGE_DETAILS.TRACEID is VARCHAR2(500) NOT NULL has function index=SUBSTR("TRACEID",1,4) and another index declared as default but on that column.

USAGE_DETAILS.USERID is VARCHAR2(50) NOT NULL
USAGE_DETAILS.EVENTTYPE is VARCHAR2(10) NOT NULL and has default index
USAGE_DETAILS.EVENT_SOURCE is VARCHAR2(200) NOT NULL and has default index

I have tried doing inner joins on the full tables rather than the parenthetically generated (subselect?) tables, but that did not perform better and also limited my ability to use an alias in the WHERE clause.

I do not know what kind of machine this is running on, just that it's set up for development. I'd like this query to give me accurate answers in under 10s. Some times the query above though still does not return even after 10+ minutes.

Plan hash value: 2784166315

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name                       | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                            |     1 |    27 |       |   258K  (1)| 00:00:11 |
|   1 |  SORT AGGREGATE                              |                            |     1 |    27 |       |            |          |
|   2 |   VIEW                                       | VM_NWVW_1                  |  1809K|    46M|       |   258K  (1)| 00:00:11 |
|   3 |    HASH GROUP BY                             |                            |  1809K|   745M|       |   258K  (1)| 00:00:11 |
|*  4 |     HASH JOIN                                |                            |  1809K|   745M|       |   258K  (1)| 00:00:11 |
|*  5 |      TABLE ACCESS FULL                       | TABLE1S                    |  5875 |   172K|       |   309   (0)| 00:00:01 |
|   6 |      MERGE JOIN SEMI                         |                            |  3079K|  1180M|       |   257K  (1)| 00:00:11 |
|   7 |       SORT JOIN                              |                            |  3079K|  1139M|       |   254K  (1)| 00:00:10 |
|   8 |        VIEW                                  |                            |  3079K|  1139M|       |   254K  (1)| 00:00:10 |
|   9 |         HASH UNIQUE                          |                            |  3079K|  1139M|  1202M|   254K  (1)| 00:00:10 |
|  10 |          INLIST ITERATOR                     |                            |       |       |       |            |          |
|  11 |           TABLE ACCESS BY INDEX ROWID BATCHED| USAGE_DETAILS              |  3079K|  1139M|       |    46   (0)| 00:00:01 |
|* 12 |            INDEX RANGE SCAN                  | IDX_UUD_EVENTTYPE          | 13704 |       |       |    46   (0)| 00:00:01 |
|* 13 |       SORT UNIQUE                            |                            |   578K|  7905K|    22M|  3558   (1)| 00:00:01 |
|  14 |        INDEX FAST FULL SCAN                  | USERID_IDX                 |   578K|  7905K|       |   704   (1)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("TRACEID"=UPPER("PAGETYPE")||'-'||TO_CHAR("SCH"."ID"))
       filter("from$_subquery$_004"."SUD_EVENT_SOURCE"=CASE  WHEN (("SCH"."PAGETYPE"='Page') AND
              ("from$_subquery$_004"."SUD_EVENTTYPE"='S')) THEN 'EVENTTYPE1' WHEN (("SCH"."PAGETYPE"='Page') AND
              ("from$_subquery$_004"."SUD_EVENTTYPE"='V')) THEN 'EVENTTYPE2' WHEN (("SCH"."PAGETYPE"='Hub') AND
              ("from$_subquery$_004"."SUD_EVENTTYPE"='S')) THEN 'EVENTTYPE3' END )
   5 - filter("SCH"."CATEGORY_NAME"<>'archive' AND "SCH"."CATEGORY_NAME"<>'testcategory' AND ("SCH"."PAGETYPE"='Hub' OR
              "SCH"."PAGETYPE"='Page'))
  12 - access("SUD"."EVENTTYPE"='S' OR "SUD"."EVENTTYPE"='V')
  13 - access("from$_subquery$_004"."SUD_USERID"="SFD"."USERID")
       filter("from$_subquery$_004"."SUD_USERID"="SFD"."USERID")

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - automatic DOP: Computed Degree of Parallelism is 1 because of no expensive parallel operation

After running exec dbms_stats.gather_table_stats(ownname=>'SCHEMA1',tabname=>'USAGE_DETAILS'); exec dbms_stats.gather_table_stats(ownname=>'SCHEMA1',tabname=>'TABLE1');

I have this new plan:

SQL> select plan_table_output from table(dbms_xplan.display());
Plan hash value: 3419946982                                                                                                         
                                                                                                                                    
----------------------------------------------------------------------------------------------------------------                    
| Id  | Operation                 | Name                       | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                    
----------------------------------------------------------------------------------------------------------------                    
|   0 | SELECT STATEMENT          |                            |     1 |    27 |       | 70152   (1)| 00:00:03 |                    
|   1 |  SORT AGGREGATE           |                            |     1 |    27 |       |            |          |                    
|   2 |   VIEW                    | VM_NWVW_1                  | 53144 |  1401K|       | 70152   (1)| 00:00:03 |                    
|   3 |    HASH GROUP BY          |                            | 53144 |    21M|    21M| 70152   (1)| 00:00:03 |                    
|*  4 |     HASH JOIN RIGHT SEMI  |                            | 53144 |    21M|    14M| 65453   (1)| 00:00:03 |                    
|   5 |      INDEX FAST FULL SCAN | USERID_IDX                 |   578K|  7905K|       |   704   (1)| 00:00:01 |                    
|*  6 |      HASH JOIN            |                            | 53144 |    20M|       | 62995   (1)| 00:00:03 |                    
|   7 |       JOIN FILTER CREATE  | :BF0000                    |  5503 |   161K|       |   309   (0)| 00:00:01 |                    
|*  8 |        TABLE ACCESS FULL  | TABLE1                     |  5503 |   161K|       |   309   (0)| 00:00:01 |                    
|   9 |       VIEW                |                            |  3549K|  1259M|       | 62677   (1)| 00:00:03 |                    
|  10 |        HASH UNIQUE        |                            |  3549K|   159M|   203M| 62677   (1)| 00:00:03 |                    
|  11 |         JOIN FILTER USE   | :BF0000                    |  3549K|   159M|       | 21035   (1)| 00:00:01 |                    
|* 12 |          TABLE ACCESS FULL| USAGE_DETAILS              |  3549K|   159M|       | 21035   (1)| 00:00:01 |                    
----------------------------------------------------------------------------------------------------------------                    
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   4 - access("from$_subquery$_004"."SUD_USERID"="SFD"."USERID")                                                                    
   6 - access("TRACEID"=UPPER("PAGETYPE")||'-'||TO_CHAR("SCH"."ID"))                                                        
       filter("from$_subquery$_004"."SUD_EVENT_SOURCE"=CASE  WHEN (("SCH"."PAGETYPE"='Page') AND                                    
              ("from$_subquery$_004"."SUD_EVENTTYPE"='S')) THEN 'EVENTTYPE1' WHEN (("SCH"."PAGETYPE"='Page') AND                     
              ("from$_subquery$_004"."SUD_EVENTTYPE"='V')) THEN 'EVENTTYPE2' WHEN (("SCH"."PAGETYPE"='Hub') AND                            
              ("from$_subquery$_004"."SUD_EVENTTYPE"='S')) THEN 'EVENTTYPE3' END )                                                   
   8 - filter("SCH"."CATEGORY_NAME"<>'archive' AND "SCH"."CATEGORY_NAME"<>'testcategory' AND                                        
              ("SCH"."PAGETYPE"='Hub' OR "SCH"."PAGETYPE"='Page'))                                                                  
  12 - filter(("SUD"."EVENTTYPE"='S' OR "SUD"."EVENTTYPE"='V') AND                                                                  
              SYS_OP_BLOOM_FILTER(:BF0000,"SUD"."TRACEID"))                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - automatic DOP: Computed Degree of Parallelism is 1 because of no expensive parallel operation                                  

37 rows selected.

Is it more likely that the compute statistics massively helped this query or that someone did something else that I was not aware of? Yes, the query ran much better, but I'd feel better too if I knew why.

halfer
  • 19,824
  • 17
  • 99
  • 186
Woodsman
  • 901
  • 21
  • 61
  • 1
    Sorry, I had problems pasting on the initial ask, had to save multiple times to see what stackoverflow was complaining about. – Woodsman Jul 27 '19 at 00:53
  • I recommend you generate the execution plan with `explain plan for SELECT ...;` and then `select * from table(dbms_xplan.display);`. In general you should avoid all IDE execution plans, they always leave off information. In this case, I would be curious to see the "Note" section. It might include some information about parallelism. Parallelism is usually helpful for long-running queries. If you want under 10 second you probably won't want to use parallelism. – Jon Heller Jul 27 '19 at 01:56
  • And if you want to investigate even more, you might want to use the SQL monitoring tool. Find the relevant SQL_ID in `GV$SQL`, and then run `select dbms_sqltune.report_sql_monitor('your SQL_ID') from dual;` That result will show the estimated and actual values. With that information, we may be able to tell where Oracle's estimates were wrong. – Jon Heller Jul 27 '19 at 01:57
  • I added the explain plan for select under SqlPlus, spooled the results and added it to the log. – Woodsman Jul 27 '19 at 06:05
  • Check [here](https://stackoverflow.com/a/34975420/4808122) how to post the *execution plan* as a text. If you user SQLPlus you must set `SET LINESIZE` and `SET PAGESIZE` – Marmite Bomber Jul 27 '19 at 07:07
  • 1
    How many rows there are in `USAGE_DETAILS` with `EVENTTYPE IN ('S', 'V')`? Oracle thinks there are only `13704` such row, I guess there will be much more of them. If so **check your table statistics** and if the event types are skew distributed the usage of **histograms**. – Marmite Bomber Jul 27 '19 at 07:11
  • When using `dbms_xplan.display` in SQL\*Plus, the output is easier to read if you `set linesize 120 pagesize 999` as the default page dimensions are rather small, which is why the plan output above is scrambled. – William Robertson Jul 27 '19 at 17:29
  • Unrelated, but: I am pretty sure that the `parallel` hint in the sub-selects achieves nothing –  Aug 05 '19 at 16:56

2 Answers2

0

Hy,

after reviewing your SQL I noticed your statements are full of string comparisons and searches. For example

        SELECT /*+ PARALLEL (AUTO) */
                 UPPER(PAGETYPE)|| '-' || SCH.ID PAGETYPE_ID ,
            SCH.PAGETYPE SCH_PAGETYPE
        FROM TABLE1 SCH
        WHERE   SCH.PAGETYPE IN ('Page', 'Hub')
            AND SCH.CATEGORY_NAME NOT IN ('archive', 'testcategory')

This can be indexed int 2 ways. First: Create table that has 'Page', 'Hub', and other types that you need, create for the a column Index and then "replace" basically adapt your query to resolve those indexes instead of string compare. Tables can have multiple indices on columns those have to be treated with caution because they create problems in regards of database size. Also I would check if what are the biggest tables and reorder their selections to the last. Meaning:

if one table has 12 rows and the other 100. First put the 12 row table then the 100 row. This will multiply in your case since the tables and nested and chained.

I made 1 more review and realized I made an oversight.

USAGE_DETAILS table has 3941810 rows
TABLE1 has 5908 rows
USER_JOB_FAMILY_MAPPING has 578233 rows

First filter the table 1, this is costly already, then Inner join raw USAGE_DETAILS and then select the join ID-s. Then inner join USER_JOB_FAMILY_MAPPING, and select after that. The reason is that the joins are done on the ID which is probably int type.

Marko Bencik
  • 368
  • 2
  • 13
  • I can definitely appreciate what you said and would like to try to do that. However, some of these fields already have an index defined by someone else. Can I have multiple indexes? Also, the unfiltered table sizes I think are in the notes, but SCH should be much much less than SUD by row count. If I understand you, the first table it sees in the SQL should be smaller than the second on a join. – Woodsman Aug 05 '19 at 13:10
  • OK, got your point, I will adapt my answer to address your concerns. – Marko Bencik Aug 05 '19 at 16:39
0

Gather statistics on the relevant objects like this:

begin
    dbms_stats.gather_table_stats(ownname => user,         tabname => 'TABLE1');
    dbms_stats.gather_table_stats(ownname => 'SOMESCHEMA', tabname => 'USAGE_DETAILS');
end;
/

This line in the execution plan implies that one of the tables is missing statistics:

- dynamic statistics used: dynamic sampling (level=2)

Not all uses of dynamic sampling imply missing statistics, but level 2 is highly suspicious. That sampling level is usually intended to "Apply dynamic sampling to all unanalyzed tables."

Optimizer statistics are necessary for Oracle to make good execution plans. The algorithms and access paths for joining small amounts of data are different than the algorithms and access paths for joining large amounts of data. The optimizer statistics help Oracle estimate the size of the results and build good plans.

If this solves your problem, you should also investigate the root cause. Optimizer statistics should always be gathered manually after a large change, and automatically by the system every night. If you have a large ETL process that significantly changes a table, it should include a call to DBMS_STATS at the end. The database by default gathers stats at 10PM every night, unless a DBA foolishly disabled the autotask.

If that doesn't solve the problem, then regenerate the execution plan with actual numbers using DBMS_SQLTUNE or the GATHER_PLAN_STATISTICS_HINT. SQL tuning is about optimizing the operations. Your SQL statement has 14 operations, each of which is like a miniature program. We need to know which one of the operations is causing the problem. Finding actual cardinalities and actual run times, and comparing them to estimates, helps tremendously with diagnosing SQL problems.


How do we know that gathering stats was what fixed the performance?

We can't be 100% sure. But it's a safe bet that gathering statistics was responsible for the improvement, for several reasons.

Bad or missing statistics are responsible for a large percentage of all Oracle performance problems. Ask any DBA and they'll have plenty of stories about missing statistics.

The Note section changes strongly imply there are no other weird things happening behind the scenes. There are lots of tricks to silently fix queries, like SQL profiles, baselines, adaptive reoptimization, dynamic sampling (shows up in the first plan, but not the second one, because stats are better), etc. But if those tricks were used they would show up in the Note section.

Jon Heller
  • 34,999
  • 6
  • 74
  • 132
  • I ran the gather statistics on two tables and walked away because it was taking a long time. The next day the query ran in about 3s. This database is being controlled by some other part of my company and I have no idea if they did anything to it. However, I added a new explain plan. – Woodsman Aug 01 '19 at 13:55
  • I thank you immensely. I did need to add 3 other SQL queries, so my timing is back to about 12-14s, but I think that is another issue. Unfortunately they hit the same tables, so I'm guessing the compute statistics would not help that problem. – Woodsman Aug 05 '19 at 13:05
  • One correction to @JonHeller's comment above. The use of dynamic sampling does NOT mean that a table is missing statistics. – BobC Aug 14 '19 at 22:09
  • @BobC Thank you, I adjusted my answer to explain that it's only level 2 that is suspicious. – Jon Heller Aug 15 '19 at 00:20