Extremely Huge time take for executing my following query?

Question

I just make some queries for select data from my server. The query is:

SELECT 
    ROUND((SUM(clength)/1048576),2) AS logical_MB,
    ROUND((SUM(plength) /1048576),2) AS physical_compr_MB,
    ds_doc.archiveno,
    ds_arch.archiveid
FROM ECR.ds_comp,
    ECR.ds_doc,
    ECR.ds_arch
WHERE ds_comp.docidno=ds_doc.docidno
    AND ds_doc.archiveno =ds_arch.archiveno
GROUP BY ds_doc.archiveno,
    ds_arch.archiveid;

result what is expecting is :

   

 9708,24            9704,93         9   Vee3    0,009255342
    13140,55        12682,93        10  Vf5     0,012095385
    104533,94       89183,02        3   Mdf4    0,085051556
    72346,34        48290,63        7   Sds2    0,046053534

But this query almost take one day. Any idea for optimize this query please?

Tip of today: Always use modern, explicit `JOIN` syntax. Easier to write (without errors), easier to read and maintain, and easier to convert to outer join if needed! — jarlh, Apr 14 '21 at 14:21
https://docs.oracle.com/en/database/oracle/oracle-database/19/tgsql/query-execution-plans.html#GUID-BE44BB45-C061-4F3A-8F18-4AAC5F74C68A — jarlh, Apr 14 '21 at 14:24
Verify the execution plan for this query. Check which indexes are used, and if these are the correct indexes. A day is very long, which make me wonder how large the dataset is you're working with? — DeMaki, Apr 14 '21 at 14:57
In addition to the execution lan, can you tell us what system this is running on (nbr of cpus, type of storage, how much ram)? which db version? and why there are 5 columns in your output while the query returns 4? — gsalem, Apr 14 '21 at 15:18
If you query 500 million records - yes it simply takes some time! — Wernfried Domscheit, Apr 14 '21 at 15:29
Record count and table size are very different things. 500M records of 10 numeric fields and 500M records of 100 varchar2(200) fields will give you different read performance. — astentx, Apr 14 '21 at 16:11
For large data volumes without highly specific selection criteria, indexes are usually a source of poor performance. If the [execution plan](https://www.williamrobertson.net/documents/explain-plan-utility.html) or [SQL Monitor report](https://sqlmaria.com/2017/08/01/getting-the-most-out-of-oracle-sql-monitor/) shows nested loops then that is most likely your problem. — William Robertson, Apr 14 '21 at 21:55

Marmite Bomber · Accepted Answer · 2021-05-03T15:11:01.223

You provide close to no information that is required to help with performance problem, so only a general checklist can be provided

Check the Query

The query does not qualify the columns clengthand plength so please check if they are defined in the table ds_comp - if not, maybe you do not need to join to this table at all...

Also I assume that docidno is a primary key of ds_doc and archiveno is PK of ds_arch. If not you query will work, but you will get a different result as you expect due to duplication caused by the join (this may also cause excesive elapsed time)!

Verify the Execution Plan

Produce the execution plan for your query in text form (to be able to post it) as follows

EXPLAIN PLAN  SET STATEMENT_ID = '<sometag>' into   plan_table  FOR
... your query here ...

SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', '<sometag>','ALL'));

Remember that you are joining complete tables (not only few rows for some ID), so if you see INDEX ACCESS or NESTED LOOP there is a problem that explains the long runtime.

You want to see only HASH JOIN and FULL TABLE SCAN in your plan.

Index Access

Contrary to some recommendations in other answers if you want to profit from Index definition you do not need indexes on join columns (as explained above). What you can do is to cover all required attributes in indexes and perform the query using only indexes and ommit the table access at all. This will help if the tables are bright, i.e. the row size is large.

This definition will be needed

 create index ds_comp_idx1 on ds_comp (docidno,clength,plength);
 create index ds_doc_idx1 on ds_doc (docidno,archiveno); 
 create index ds_arch_idx1 on ds_arch (archiveno,archiveid);

and you will receive this plan

----------------------------------------------------------------------------------------
| Id  | Operation               | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |              |  1119K|    97M|   908  (11)| 00:00:01 |
|   1 |  HASH GROUP BY          |              |  1119K|    97M|   908  (11)| 00:00:01 |
|*  2 |   HASH JOIN             |              |  1119K|    97M|   831   (3)| 00:00:01 |
|*  3 |    HASH JOIN            |              |  1001 | 52052 |     5   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN     | DS_ARCH_IDX1 |    11 |   286 |     1   (0)| 00:00:01 |
|   5 |     INDEX FAST FULL SCAN| DS_DOC_IDX1  |  1001 | 26026 |     4   (0)| 00:00:01 |
|   6 |    INDEX FAST FULL SCAN | DS_COMP_IDX1 |  1119K|    41M|   818   (2)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("C"."DOCIDNO"="D"."DOCIDNO")
   3 - access("D"."ARCHIVENO"="A"."ARCHIVENO")

Note the INDEX FULL SCAN and INDEX FAST FULL SCAN which means you are scanning the data from the index only and you do not need to perform the full table scan.

Use Parallel Option

With your rather simple query there is not much option to improve something. What works always is to deploy a parallel query using the /*+ PARALLEL(N) */ hint.

The precontition is that your database is configured for this option and you have hardware that can deploy it.

score 0 · Answer 2 · edited Apr 14 '21 at 16:57

Rewrite using explicit joins:

    SELECT 
        ROUND((SUM(clength)/1048576),2) AS logical_MB,
        ROUND((SUM(plength) /1048576),2) AS physical_compr_MB,
        d.archiveno,
        a.archiveid
    FROM ECR.ds_comp c
         INNER JOIN ECR.ds_doc d ON c.docidno=d.docidno
         INNER JOIN ECR.ds_arch a ON d.archiveno=a.archiveno
    GROUP BY d.archiveno,
        a.archiveid;

Check indexes exist on join columns c.docidno, d.docidno, d.archiveno, a.archiveno

Extremely Huge time take for executing my following query?

2 Answers2