2

Suppose I have one very large table, with 250M rows:

create table example_customers as
select dbms_random.string('x', 100) as first_name
     , dbms_random.string('x', 100) as last_name
  from dual
connect by level <= 250000000;
begin
  dbms_stats.gather_table_stats(user, 'example_customers');
end;

I want to do a full scan of this table, with a simple query: select count(*) from example_customers

PLAN_TABLE_OUTPUT
Plan hash value: 2907982153
 
----------------------------------------------------------------------------------------
| Id  | Operation                  | Name              | Rows  | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                   |     1 |   384K  (1)| 00:00:16 |
|   1 |  SORT AGGREGATE            |                   |     1 |            |          |
|   2 |   TABLE ACCESS STORAGE FULL| EXAMPLE_CUSTOMERS |   250M|   384K  (1)| 00:00:16 |
----------------------------------------------------------------------------------------

I then decide that I want to use the parallel hint to try to speed this up, using select /*+parallel(10)*/ count(*) from example_customers

Plan hash value: 2126708148
 
-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name              | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                   |     1 | 42744   (1)| 00:00:02 |        |      |            |
|   1 |  SORT AGGREGATE                |                   |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR               |                   |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)         | :TQ10000          |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE             |                   |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR         |                   |   250M| 42744   (1)| 00:00:02 |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS STORAGE FULL| EXAMPLE_CUSTOMERS |   250M| 42744   (1)| 00:00:02 |  Q1,00 | PCWP |            |
-------------------------------------------------------------------------------------------------------------------------

At first glance, this looks to be a big improvement - The total cost of the first statement is 384K, and the total cost of the new statement is 43K, which indicates to me that the cost of the parallel execution statement is taken as the highest of the 10 individual operations that run in parallel, plus some additional cost of the overhead of parallel exeuction.

When I actually run these statements, however, I do not see any time difference - both of them are taking approximately 35 seconds. Why is the second statement not faster?

I have some theories, but I don't know how to verify which of them (if any) is the reason:

  • Each of the 10 individual parallel operations is full scanning the table (as seen by the xplan listing the rows returned as 250m throughout)
  • The first query is already hitting the limit for how fast the database can read data across all operations, so the second query is still restricted by this same limit
  • The parallel operations are being forced into serial operations (in which case I would expect to see PX COORDINATOR FORCED SERIAL instead of PX COORDINATOR

Additional Information

The SQL monitor report for the query without the hint:

SQL Monitoring Report

SQL Text
------------------------------
select count(*) from example_customers

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)                   
 Instance ID         :  1                                 
 Session             :  TESTING (2719:63368)              
 SQL ID              :  2v9j9nz9748xq                     
 SQL Execution ID    :  16777221                          
 Execution Started   :  07/27/2020 13:51:26               
 First Refresh Time  :  07/27/2020 13:51:30               
 Last Refresh Time   :  07/27/2020 13:52:01               
 Duration            :  35s                               
 Module/Action       :  PL/SQL Developer/SQL Window - New 
 Service             :  UCL                               
 Program             :  plsqldev.exe                      
 Fetch Calls         :  1                                 

Global Stats
=============================================================================================================
| Elapsed |   Cpu   |    IO    | Application | Fetch | Buffer | Read  | Read  |  Offload   |    Offload     |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs  | Bytes | Elig Bytes | Returned Bytes |
=============================================================================================================
|      36 |      34 |     1.39 |        0.00 |     1 |     1M | 11086 |  11GB |       11GB |           11GB |
=============================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=2907982153)
=======================================================================================================================================================================
| Id |          Operation           |       Name        |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read  | Read  |  Mem  | Activity | Activity Detail |
|    |                              |                   | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs  | Bytes | (Max) |   (%)    |   (# samples)   |
=======================================================================================================================================================================
|  0 | SELECT STATEMENT             |                   |         |      |        32 |     +4 |     1 |        1 |       |       |     . |          |                 |
|  1 |   SORT AGGREGATE             |                   |       1 |      |        32 |     +4 |     1 |        1 |       |       |     . |          |                 |
|  2 |    TABLE ACCESS STORAGE FULL | EXAMPLE_CUSTOMERS |    250M | 385K |        36 |     +0 |     1 |     250M | 11086 |  11GB |   7MB |          |                 |
=======================================================================================================================================================================

The SQL monitor report for the query with the hint:

SQL Monitoring Report

SQL Text
------------------------------
select /*+parallel(10)*/ count(*) from example_customers

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)                   
 Instance ID         :  1                                 
 Session             :  TESTING (1937:31657)              
 SQL ID              :  882dpyfj3hx4m                     
 SQL Execution ID    :  16777216                          
 Execution Started   :  07/27/2020 13:51:20               
 First Refresh Time  :  07/27/2020 13:51:24               
 Last Refresh Time   :  07/27/2020 13:51:55               
 Duration            :  35s                               
 Module/Action       :  PL/SQL Developer/SQL Window - New 
 Service             :  UCL                               
 Program             :  plsqldev.exe                      
 DOP Downgrade       :  100%                              
 Fetch Calls         :  1                                 

Global Stats
=============================================================================================================
| Elapsed |   Cpu   |    IO    | Application | Fetch | Buffer | Read  | Read  |  Offload   |    Offload     |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs  | Bytes | Elig Bytes | Returned Bytes |
=============================================================================================================
|      36 |      35 |     1.32 |        0.00 |     1 |     1M | 11086 |  11GB |       11GB |           11GB |
=============================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=2126708148)
============================================================================================================================================================================
| Id |            Operation             |       Name        |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  |  Mem  | Activity | Activity Detail |
|    |                                  |                   | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes | (Max) |   (%)    |   (# samples)   |
============================================================================================================================================================================
|  0 | SELECT STATEMENT                 |                   |         |       |         1 |    +35 |     1 |        1 |       |       |     . |          |                 |
|  1 |   SORT AGGREGATE                 |                   |       1 |       |         1 |    +35 |     1 |        1 |       |       |     . |          |                 |
|  2 |    PX COORDINATOR                |                   |         |       |           |        |     1 |          |       |       |     . |          |                 |
|  3 |     PX SEND QC (RANDOM)          | :TQ10000          |       1 |       |         1 |    +35 |     1 |        0 |       |       |     . |          |                 |
|  4 |      SORT AGGREGATE              |                   |       1 |       |        32 |     +4 |     1 |        1 |       |       |     . |          |                 |
|  5 |       PX BLOCK ITERATOR          |                   |    250M | 42744 |        32 |     +4 |     1 |     250M |       |       |     . |          |                 |
|  6 |        TABLE ACCESS STORAGE FULL | EXAMPLE_CUSTOMERS |    250M | 42744 |        36 |     +0 |     1 |     250M | 11086 |  11GB |   7MB |          |                 |
============================================================================================================================================================================

Even More Information

If the exadata smart scan is disabled:

select /*+ OPT_PARAM('cell_offload_processing' 'false') PARALLEL(10) */ count(*) from example_customers;

SQL Monitoring Report

SQL Text
------------------------------
select /*+ OPT_PARAM('cell_offload_processing' 'false') PARALLEL(10) */ count(*) from example_customers                            

Global Stats
==========================================================================================
| Elapsed |   Cpu   |    IO    | Application |  Other   | Fetch | Buffer | Read  | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Calls |  Gets  | Reqs  | Bytes |
==========================================================================================
|      38 |      36 |     2.04 |        0.00 |     0.11 |     1 |     1M | 11086 |  11GB |
==========================================================================================

SQL Plan Monitoring Details (Plan Hash Value=2126708148)
====================================================================================================================================================================
| Id |            Operation             |       Name        |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Activity | Activity Detail |
|    |                                  |                   | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes |   (%)    |   (# samples)   |
====================================================================================================================================================================
|  0 | SELECT STATEMENT                 |                   |         |       |         1 |    +38 |     1 |        1 |       |       |          |                 |
|  1 |   SORT AGGREGATE                 |                   |       1 |       |         1 |    +38 |     1 |        1 |       |       |          |                 |
|  2 |    PX COORDINATOR                |                   |         |       |           |        |     1 |          |       |       |          |                 |
|  3 |     PX SEND QC (RANDOM)          | :TQ10000          |       1 |       |         1 |    +38 |     1 |        0 |       |       |          |                 |
|  4 |      SORT AGGREGATE              |                   |       1 |       |        35 |     +4 |     1 |        1 |       |       |          |                 |
|  5 |       PX BLOCK ITERATOR          |                   |    250M | 42744 |        35 |     +4 |     1 |     250M |       |       |          |                 |
|  6 |        TABLE ACCESS STORAGE FULL | EXAMPLE_CUSTOMERS |    250M | 42744 |        38 |     +1 |     1 |     250M | 11086 |  11GB |          |                 |
====================================================================================================================================================================

If I use a more complex query with a group by, then I see some improvement (71 seconds to 61), but still much less than I would expect:

SQL Monitoring Report

SQL Text
------------------------------
select substr(surname, 1, 1) , count(*) from example_customers group by substr(surname, 1, 1) order by substr(surname, 1, 1)                            

Global Stats
=============================================================================================================
| Elapsed |   Cpu   |    IO    | Application | Fetch | Buffer | Read  | Read  |  Offload   |    Offload     |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs  | Bytes | Elig Bytes | Returned Bytes |
=============================================================================================================
|      71 |      70 |     1.39 |        0.00 |     1 |     1M | 11086 |  11GB |       11GB |           11GB |
=============================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=525074000)
=======================================================================================================================================================================
| Id |          Operation           |       Name        |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read  | Read  |  Mem  | Activity | Activity Detail |
|    |                              |                   | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs  | Bytes | (Max) |   (%)    |   (# samples)   |
=======================================================================================================================================================================
|  0 | SELECT STATEMENT             |                   |         |      |        66 |     +6 |     1 |        5 |       |       |     . |          |                 |
|  1 |   SORT GROUP BY              |                   |       4 | 393K |        69 |     +3 |     1 |        5 |       |       |  2048 |          |                 |
|  2 |    TABLE ACCESS STORAGE FULL | EXAMPLE_CUSTOMERS |    250M | 385K |        71 |     +1 |     1 |     250M | 11086 |  11GB |   7MB |          |                 |
=======================================================================================================================================================================
SQL Monitoring Report

SQL Text
------------------------------
select /*+ parallel(10) */substr(surname, 1, 1) , count(*) from example_customers group by substr(surname, 1, 1) order by substr(surname, 1, 1)                        

Global Stats
=============================================================================================================
| Elapsed |   Cpu   |    IO    | Application | Fetch | Buffer | Read  | Read  |  Offload   |    Offload     |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs  | Bytes | Elig Bytes | Returned Bytes |
=============================================================================================================
|      61 |      60 |     1.36 |        0.00 |     1 |     1M | 11086 |  11GB |       11GB |           11GB |
=============================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=3312522119)
==============================================================================================================================================================================
| Id |             Operation              |       Name        |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  |  Mem  | Activity | Activity Detail |
|    |                                    |                   | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes | (Max) |   (%)    |   (# samples)   |
==============================================================================================================================================================================
|  0 | SELECT STATEMENT                   |                   |         |       |           |        |     1 |          |       |       |     . |          |                 |
|  1 |   PX COORDINATOR                   |                   |         |       |           |        |     1 |          |       |       |     . |          |                 |
|  2 |    PX SEND QC (ORDER)              | :TQ10001          |       4 | 43519 |         1 |    +61 |     1 |        0 |       |       |     . |          |                 |
|  3 |     SORT GROUP BY                  |                   |       4 | 43519 |         1 |    +61 |     1 |        5 |       |       |  2048 |          |                 |
|  4 |      PX RECEIVE                    |                   |       4 | 43519 |           |        |     1 |          |       |       |     . |          |                 |
|  5 |       PX SEND RANGE                | :TQ10000          |       4 | 43519 |         1 |    +61 |     1 |        0 |       |       |     . |          |                 |
|  6 |        HASH GROUP BY               |                   |       4 | 43519 |        58 |     +4 |     1 |        5 |       |       |   3MB |          |                 |
|  7 |         PX BLOCK ITERATOR          |                   |    250M | 42771 |        58 |     +4 |     1 |     250M |       |       |     . |          |                 |
|  8 |          TABLE ACCESS STORAGE FULL | EXAMPLE_CUSTOMERS |    250M | 42771 |        61 |     +1 |     1 |     250M | 11086 |  11GB |   7MB |          |                 |
==============================================================================================================================================================================

If we use a query with a lot of sort operations, then we see the parallel execution actually run slower:

SQL Monitoring Report

SQL Text
------------------------------
select first_name , surname , row_number() over (partition by first_name order by surname asc) rn1 , row_number() over (partition by first_name order by surname desc) rn2 , row_number() over (partition by surname order by first_name asc) rn3 , row_number() over (partition by surname order by first_name desc) rn4 from example_customers                     

Global Stats
=============================================================================================================================
| Elapsed |   Cpu   |    IO    | Application | Fetch | Buffer | Read  | Read  | Write | Write |  Offload   |    Offload     |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs  | Bytes | Reqs  | Bytes | Elig Bytes | Returned Bytes |
=============================================================================================================================
|     838 |     688 |      150 |        0.00 |     1 |     1M | 59275 |  34GB | 58982 |  34GB |       11GB |           45GB |
=============================================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=3818639180)
==================================================================================================================================================================================================
| Id |            Operation            |       Name        |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read  | Read  | Write | Write |  Mem  | Temp  | Activity | Activity Detail |
|    |                                 |                   | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs  | Bytes | Reqs  | Bytes | (Max) | (Max) |   (%)    |   (# samples)   |
==================================================================================================================================================================================================
|  0 | SELECT STATEMENT                |                   |         |      |       246 |   +592 |     1 |      100 |       |       |       |       |     . |     . |          |                 |
|  1 |   WINDOW SORT                   |                   |    250M |   6M |       246 |   +592 |     1 |      100 |     1 |   1MB | 18233 |  11GB |   1GB |  11GB |          |                 |
|  2 |    WINDOW SORT                  |                   |    250M |   6M |       493 |   +345 |     1 |     250M | 19507 |  10GB | 14026 |  10GB |   1GB |  10GB |          |                 |
|  3 |     WINDOW SORT                 |                   |    250M |   6M |       473 |   +118 |     1 |     250M | 16086 |   8GB | 15230 |   8GB |   1GB |   8GB |          |                 |
|  4 |      WINDOW SORT                |                   |    250M |   6M |       346 |     +0 |     1 |     250M | 12595 |   6GB | 11493 |   6GB |   1GB |   6GB |          |                 |
|  5 |       TABLE ACCESS STORAGE FULL | EXAMPLE_CUSTOMERS |    250M | 385K |       115 |     +2 |     1 |     250M | 11086 |  11GB |       |       |   7MB |     . |          |                 |
==================================================================================================================================================================================================
SQL Monitoring Report

SQL Text
------------------------------
select /*+ parallel(10) */ first_name , surname , row_number() over (partition by first_name order by surname asc) rn1 , row_number() over (partition by first_name order by surname desc) rn2 , row_number() over (partition by surname order by first_name asc) rn3 , row_number() over (partition by surname order by first_name desc) rn4 from example_customers                   

Global Stats
============================================================================================================================
| Elapsed |   Cpu   |    IO    | Application | Fetch | Buffer | Read | Read  | Write | Write |  Offload   |    Offload     |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs | Bytes | Reqs  | Bytes | Elig Bytes | Returned Bytes |
============================================================================================================================
|     919 |     747 |      172 |        0.00 |     1 |     1M | 116K |  40GB | 72314 |  40GB |       11GB |           51GB |
============================================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=2906577827)
============================================================================================================================================================================================================
| Id |                Operation                 |       Name        |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Write | Write |  Mem  | Temp  | Activity | Activity Detail |
|    |                                          |                   | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes | Reqs  | Bytes | (Max) | (Max) |   (%)    |   (# samples)   |
============================================================================================================================================================================================================
|  0 | SELECT STATEMENT                         |                   |         |       |         1 |   +435 |     1 |        0 |       |       |       |       |     . |     . |          |                 |
|  1 |   PX COORDINATOR                         |                   |         |       |           |        |     1 |          |       |       |       |       |     . |     . |          |                 |
|  2 |    PX SEND QC (RANDOM)                   | :TQ10003          |    250M |  672K |         1 |   +917 |     1 |        0 |       |       |       |       |     . |     . |          |                 |
|  3 |     WINDOW SORT                          |                   |    250M |  672K |       234 |   +684 |     1 |      100 |     1 |   1MB | 18234 |  11GB |   1GB |  11GB |          |                 |
|  4 |      WINDOW SORT                         |                   |    250M |  672K |       466 |   +452 |     1 |     250M | 19507 |  10GB | 16146 |  10GB |   1GB |  10GB |          |                 |
|  5 |       PX RECEIVE                         |                   |    250M |  672K |           |        |     1 |          |       |       |       |       |     . |     . |          |                 |
|  6 |        PX SEND HASH                      | :TQ10002          |    250M |  672K |       236 |   +447 |     1 |        0 |       |       |       |       |     . |     . |          |                 |
|  7 |         WINDOW SORT                      |                   |    250M |  672K |       480 |   +203 |     1 |     250M | 16086 |   8GB | 16015 |   8GB |   1GB |   8GB |          |                 |
|  8 |          PX RECEIVE                      |                   |    250M |  672K |           |        |     1 |          |       |       |       |       |     . |     . |          |                 |
|  9 |           PX SEND HASH                   | :TQ10001          |    250M |  672K |       245 |   +203 |     1 |        0 |       |       |       |       |     . |     . |          |                 |
| 10 |            WINDOW SORT                   |                   |    250M |  672K |       447 |     +1 |     1 |     250M | 69191 |  12GB | 21919 |  12GB |   1GB |   6GB |          |                 |
| 11 |             PX RECEIVE                   |                   |    250M | 42771 |           |        |     1 |          |       |       |       |       |     . |     . |          |                 |
| 12 |              PX SEND RANGE               | :TQ10000          |    250M | 42771 |       115 |     +4 |     1 |        0 |       |       |       |       |     . |     . |          |                 |
| 13 |               PX BLOCK ITERATOR          |                   |    250M | 42771 |       115 |     +4 |     1 |     250M |       |       |       |       |     . |     . |          |                 |
| 14 |                TABLE ACCESS STORAGE FULL | EXAMPLE_CUSTOMERS |    250M | 42771 |       115 |     +4 |     1 |     250M | 11086 |  11GB |       |       |   7MB |     . |          |                 |
============================================================================================================================================================================================================```
EdG
  • 137
  • 1
  • 10
  • I will replicate your test case. It should go faster . give me some minutes for completion ;) – Roberto Hernandez Jul 27 '20 at 13:09
  • Show us real execution plans with statistics, not explain. And also if you have diagnostic and tuning packs, please provide real-time sql monitor report – Sayan Malakshinov Jul 27 '20 at 13:32
  • @EdG . . . What hardware are you running on? Do you have multiple compute nodes? Or are the threads all competing for the same resources? – Gordon Linoff Jul 27 '20 at 13:33
  • 1
    @SayanMalakshinov SQL monitor report for both queries added. – EdG Jul 27 '20 at 13:43
  • @GordonLinoff The database is running on two instances. `parallel_threads_per_cpu` is set to 2 and `cpu_count` is 8, so 10 parallel threads should be possible. The events for each session are mostly `cell smart table scan`, I don't think they're competing for the same resources (though I might not be looking in the right place) – EdG Jul 27 '20 at 13:49
  • you are running your queries on Exadata, as I can see `table access storage full`which it refers to Exadata's Smart Scan and cell offload capability. – Roberto Hernandez Jul 27 '20 at 15:27
  • @RobertoHernandez does running on Exadata interfere with using the parallel execution? – EdG Jul 27 '20 at 15:32
  • it does and a lot. Let me explain that in the answer – Roberto Hernandez Jul 27 '20 at 15:32
  • Pretty strange report: I don't see slaves' statistics. Looks like parallel execution was downgraded to serial. Have you checked pq_stats? – Sayan Malakshinov Jul 27 '20 at 19:00
  • @SayanMalakshinov do you mean gv$pq_tqstat? I don't see anything in there, either when the query is running or after it has finished – EdG Jul 27 '20 at 19:18
  • 1
    Second SQL Monitoring report shows DOP downgrade 100% –  Jul 27 '20 at 19:43
  • 1
    @Kumar that does sound like the culprit, is there a way of finding out why it's 100%? – EdG Jul 27 '20 at 19:56

2 Answers2

1

I test your scenario, so let me show you. I could not use CONNECT BY LEVEL because I ran out of memory, but I insert 50M records, which is enough for the comparison

SQL> select count(*) from test_perf.example_customers ;

  COUNT(*)
----------
   5000000

Elapsed: 00:00:02.15
SQL> desc test_perf.example_customers
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 C1                                                 VARCHAR2(120 CHAR)
 C2                                                 VARCHAR2(120 CHAR)

SQL>
SQL>
SQL> alter table test_perf.example_customers noparallel ;

Table altered.

Elapsed: 00:00:00.01
SQL>

Now, let's see both plans, with and without parallel

SQL> set autotrace traceonly explain
SQL> select count(*) from test_perf.example_customers ;
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 2907982153

--------------------------------------------------------------------------------
| Id  | Operation          | Name              | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                   |     1 | 24110   (1)| 00:00:02 |
|   1 |  SORT AGGREGATE    |                   |     1 |            |          |
|   2 |   TABLE ACCESS FULL| EXAMPLE_CUSTOMERS |  5000K| 24110   (1)| 00:00:02 |
--------------------------------------------------------------------------------

SQL> select /*+parallel(a,10) */ count(*) from test_perf.example_customers a
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 2126708148



 -----------------------------------------------------------------------------------------------------------------
    | Id  | Operation              | Name              | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
    -----------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT       |                   |     1 |  2677   (1)| 00:00:01 |        |      |            |
    |   1 |  SORT AGGREGATE        |                   |     1 |            |          |        |      |            |
    |   2 |   PX COORDINATOR       |                   |       |            |          |        |      |            |
    |   3 |    PX SEND QC (RANDOM) | :TQ10000          |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
    |   4 |     SORT AGGREGATE     |                   |     1 |            |          |  Q1,00 | PCWP |            |
    |   5 |      PX BLOCK ITERATOR |                   |  5000K|  2677   (1)| 00:00:01 |  Q1,00 | PCWC |            |
    |   6 |       TABLE ACCESS FULL| EXAMPLE_CUSTOMERS |  5000K|  2677   (1)| 00:00:01 |  Q1,00 | PCWP |            |
    -----------------------------------------------------------------------------------------------------------------
    
    Note
    -----
       - Degree of Parallelism is 10 because of table property
    
    SQL>

Now let's check what actually happens

SQL> set autotrace traceonly
SQL> select count(*) from test_perf.example_customers ;

Elapsed: 00:00:00.45

Execution Plan
----------------------------------------------------------
Plan hash value: 2907982153

--------------------------------------------------------------------------------
| Id  | Operation          | Name              | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                   |     1 | 24110   (1)| 00:00:02 |
|   1 |  SORT AGGREGATE    |                   |     1 |            |          |
|   2 |   TABLE ACCESS FULL| EXAMPLE_CUSTOMERS |  5000K| 24110   (1)| 00:00:02 |
--------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      68305  consistent gets
          0  physical reads
     358820  redo size
        542  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> set autotrace traceonly
SQL> select /*+parallel(a,10) */ count(*) from test_perf.example_customers a
  2  ;

Elapsed: 00:00:01.55

Execution Plan
----------------------------------------------------------
Plan hash value: 2126708148

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name              | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                   |     1 |  2677   (1)| 00:00:01 |        |      |            |
|   1 |  SORT AGGREGATE        |                   |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |                   |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000          |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |                   |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |                   |  5000K|  2677   (1)| 00:00:01 |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL| EXAMPLE_CUSTOMERS |  5000K|  2677   (1)| 00:00:01 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------------

Note
-----
   - Degree of Parallelism is 10 because of table property


Statistics
----------------------------------------------------------
         31  recursive calls
          0  db block gets
      91667  consistent gets
      64207  physical reads
          0  redo size
        542  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Let's flush buffer cache and shared pool and try again

SQL> select count(*) from test_perf.example_customers ;

  COUNT(*)
----------
   5000000

Elapsed: 00:00:01.44
SQL> select /*+parallel(a,10) */ count(*) from test_perf.example_customers a
  2  ;

  COUNT(*)
----------
   5000000

Elapsed: 00:00:00.66
SQL>

With 100M records

SQL> select count(*) from test_perf.example_customers ;

  COUNT(*)
----------
  10000000

Elapsed: 00:00:11.00
SQL> select /*+parallel(a,16) */  count(*) from test_perf.example_customers a ;

  COUNT(*)
----------
  10000000

Elapsed: 00:00:02.71
SQL>

** UPDATE **

With PARALLEL_DEGREE_POLICY to AUTO

SQL> alter table test_perf.example_customers noparallel ;

Table altered.

SQL> alter session set parallel_degree_policy=auto ;

Session altered.

SQL> alter session enable parallel query;

Session altered.

SQL> set autotrace traceonly explain
SQL> select /*+parallel */ count(*) from test_perf.example_customers a ;

Execution Plan
----------------------------------------------------------
Plan hash value: 2126708148

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name              | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                   |     1 | 13387   (1)| 00:00:02 |        |      |            |
|   1 |  SORT AGGREGATE        |                   |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |                   |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000          |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |                   |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |                   |  5000K| 13387   (1)| 00:00:02 |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL| EXAMPLE_CUSTOMERS |  5000K| 13387   (1)| 00:00:02 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------------

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 2
   - parallel scans affinitized for buffer cache

As you can see with 50M rows you notice the difference, and with 100M the difference is even greater. I can assure you that under normal circumstances, parallel execution will go faster in comparison with serializable queries the bigger the table is. Exadata is another world. Exadata's Smart Scan and cell offload capability runs in very very fast disks, and you don't notice the difference due to the high speed I/O. Key Osborne has an amazing article where it describes the amazing capabilities of the Storage Server in Exadata and also storage indexes.

http://kerryosborne.oracle-guy.com/2010/08/10/oracle-exadata-storage-indexes/

In 250M rows,you won't notice probably because the time Oracle expends in the creation of QC and slaves, and the communication between them ( which is memory ) is more or less the same that the Exadata storage layer is expending in reading the query.

Sorry, but I cannot test in Exadata ;)

Roberto Hernandez
  • 8,231
  • 3
  • 14
  • 43
  • I don't think it's due to exadata - when I disable the smart scan feature (using the OPT_PARAM('cell_offload_processing' 'false') hint), the parallel execution hint still doesn't save any time – EdG Jul 27 '20 at 16:57
  • can you share those execution plans with the hint ? did you flush your buffer cache before to run the test ? – Roberto Hernandez Jul 27 '20 at 17:00
  • I would try another kind of query, like group by or using some window function. that perhaps might give you a better insight – Roberto Hernandez Jul 27 '20 at 17:01
  • Will do and get back to you – EdG Jul 27 '20 at 17:06
  • More tests and examples have been added, it's not looking like the smart scan is the problem. – EdG Jul 27 '20 at 18:06
  • It is quite difficult to understand what is happening without a lot of more details. But It looks to me that in your new examples you are still accessing the smart cell cache of Exadata ;) I can't tell you more, I have no exadata but for the last advice -> don't use a specific degree, try set a session level PARALLEL_DEGREE_POLICY=AUTO` then use `/*+parallel(a) */ .... from table a` . Let Oracle decide which degree is the best for the query. – Roberto Hernandez Jul 27 '20 at 18:14
  • What additional details should I provide? If alter the session and use the parallel hint without specifying DOP, then the exeution plan doesn't use parallel execution, and the xplan says "Degree of Parallelism is 1 because of hint". – EdG Jul 27 '20 at 18:20
  • that is quite strange. If you specify degree to AUTO, then the hint should not have any degree , only parallel(a) , of course the a means the alias of the table. you must enable parallel query by `alter session enable parallel query`. I will show you by updating the answer – Roberto Hernandez Jul 27 '20 at 18:23
  • my bad, I made a mistake, it is parallel without anything, like I put in the answer – Roberto Hernandez Jul 27 '20 at 18:29
  • remove the alias. I was wrong. If you put the alias, it relays in the degree of the table, that is 1, then it runs in noparallel. Look up to my updated answer, the hint does not have any alias, just the word parallel – Roberto Hernandez Jul 27 '20 at 18:37
  • @EdG , can you at least upvote the answer ? not accept, just upvote – Roberto Hernandez Jul 27 '20 at 18:38
  • In this case it's still using one session, but it's giving a more helpful message: `automatic DOP: Computed Degree of Parallelism is 1` In V$PARAMETER, I can see that PARALLEL_DEGREE_LIMIT is CPU, which means that the default DOP should be (in my case) 32. – EdG Jul 27 '20 at 18:39
  • If you use PARALLEL_DEGREE_POLICY to MANUAL (Default) DOP will be calculated as CPU_COUNT*2. In my example, my server has 16 CPU , then DOP computes to 32. If I use PARALLEL_DEGREE_POLICY to AUTO, then DOP will be calculated automatically by Oracle, as long as I don't specify a fix degree and the table has parallel default or the session level is enabling parallel query and the hint parallel without degree is used – Roberto Hernandez Jul 27 '20 at 20:17
0

I found the root problem, the steps involved were:

  • The sql monitor report was reporting a 100% DOP downgrade for the attempts to use parallel execution
  • Using dbms_sqltune.report_sql_detail, I was able to get additional information on the steps in the execution plan
  • Using this, I was able to get the DOP downgrading reason for the PX COORDINATOR`operation (352)
  • 352 sometimes means an insufficient number of processes, but there are lots of availabile px processes
  • 352 sometimes means the resource manager has a maximum DOP limit
  • The resource manager has several plans, some of which have a parallel degree limit of 1 in dba_rsrc_plan_directives
  • By swapping consumer group for the session, the parallel hint works as expected
  • The execution time of the query with parallel execution goes from 35 seconds to less than 3 seconds

The code used:

begin
  dbms_resource_manager.switch_consumer_group_for_sess(session_id => 2690
                                                      ,session_serial => 25388
                                                      ,consumer_group => 'HIGH'
                                                      );
  
end;
select /*+ parallel */ count(*)
  from EXAMPLE_CUSTOMERS

The generated sql report:

SQL Monitoring Report

SQL Text
------------------------------
select /*+ parallel */ count(*) from EXAMPLE_CUSTOMERS

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)                   
 Instance ID         :  1                                 
 Session             :  TESTING (2690:25388)              
 SQL ID              :  cr2th8jckamab                     
 SQL Execution ID    :  16777219                          
 Execution Started   :  07/28/2020 00:05:19               
 First Refresh Time  :  07/28/2020 00:05:18               
 Last Refresh Time   :  07/28/2020 00:05:21               
 Duration            :  2s                                
 Module/Action       :  PL/SQL Developer/SQL Window - New 
 Service             :  UCL                               
 Program             :  plsqldev.exe                      
 Fetch Calls         :  1                                 

Global Stats
======================================================================================================================================
| Elapsed |   Cpu   |    IO    | Application | Concurrency |  Other   | Fetch | Buffer | Read  | Read  |  Offload   |    Offload     |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Waits(s)   | Waits(s) | Calls |  Gets  | Reqs  | Bytes | Elig Bytes | Returned Bytes |
======================================================================================================================================
|      51 |      36 |     3.08 |        0.00 |        0.00 |       12 |     1 |     1M | 11137 |  11GB |       11GB |           11GB |
======================================================================================================================================

Parallel Execution Details (DOP=25 , Servers Allocated=25)
 Instances  : 2 

============================================================================================================================================================================================
| Instance |      Name      | Type  | Server# | Elapsed |   Cpu   |    IO    | Application | Concurrency |  Other   | Buffer | Read | Read  |  Offload   |    Offload     | Wait Events |  |
|          |                |       |         | Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Waits(s)   | Waits(s) |  Gets  | Reqs | Bytes | Elig Bytes | Returned Bytes | (sample #)  |  |
============================================================================================================================================================================================
| 1        | PX Coordinator | QC    |         |    0.48 |    0.05 |          |        0.00 |             |     0.43 |      7 |      |     . |          . |              . |        NaN% |  |
| 1        | p000           | Set 1 |       1 |    1.97 |    1.39 |     0.12 |             |        0.00 |     0.47 |  56158 |  440 | 437MB |      437MB |          437MB |        NaN% |  |
| 1        | p001           | Set 1 |       2 |    2.09 |    1.45 |     0.13 |             |             |     0.51 |  56563 |  443 | 440MB |      440MB |          440MB |        NaN% |  |
| 1        | p002           | Set 1 |       3 |    1.94 |    1.38 |     0.11 |             |             |     0.45 |  54215 |  424 | 422MB |      422MB |          422MB |        NaN% |  |
| 1        | p003           | Set 1 |       4 |    2.05 |    1.19 |     0.11 |             |        0.00 |     0.74 |  44952 |  355 | 350MB |      350MB |          350MB |        NaN% |  |
| 1        | p004           | Set 1 |       5 |    2.09 |    1.47 |     0.13 |             |             |     0.49 |  57279 |  448 | 446MB |      446MB |          446MB |        NaN% |  |
| 1        | p005           | Set 1 |       6 |    2.09 |    1.41 |     0.11 |             |        0.00 |     0.57 |  54826 |  428 | 427MB |      427MB |          427MB |        NaN% |  |
| 1        | p006           | Set 1 |       7 |    2.13 |    1.16 |     0.10 |             |             |     0.87 |  46373 |  363 | 361MB |      361MB |          361MB |        NaN% |  |
| 1        | p007           | Set 1 |       8 |    2.18 |    1.18 |     0.13 |             |             |     0.87 |  46237 |  361 | 360MB |      360MB |          360MB |        NaN% |  |
| 1        | p008           | Set 1 |       9 |    2.10 |    1.39 |     0.13 |             |             |     0.59 |  55364 |  433 | 431MB |      431MB |          431MB |        NaN% |  |
| 1        | p009           | Set 1 |      10 |    2.11 |    1.48 |     0.12 |             |        0.00 |     0.51 |  59575 |  465 | 464MB |      464MB |          464MB |        NaN% |  |
| 1        | p00a           | Set 1 |      11 |    2.02 |    1.21 |     0.10 |             |             |     0.72 |  45981 |  360 | 358MB |      358MB |          358MB |        NaN% |  |
| 1        | p00b           | Set 1 |      12 |    1.98 |    1.20 |     0.11 |             |             |     0.67 |  48583 |  382 | 378MB |      378MB |          378MB |        NaN% |  |
| 1        | p00c           | Set 1 |      13 |    2.03 |    1.36 |     0.13 |             |             |     0.54 |  53580 |  421 | 417MB |      417MB |          417MB |        NaN% |  |
| 1        | p00d           | Set 1 |      14 |    2.17 |    1.22 |     0.10 |             |             |     0.85 |  48413 |  380 | 377MB |      377MB |          377MB |        NaN% |  |
| 1        | p00e           | Set 1 |      15 |    2.04 |    1.17 |     0.10 |             |             |     0.77 |  47261 |  370 | 368MB |      368MB |          368MB |        NaN% |  |
| 1        | p00f           | Set 1 |      16 |    2.07 |    1.18 |     0.11 |             |             |     0.79 |  46365 |  363 | 361MB |      361MB |          361MB |        NaN% |  |
| 2        | p000           | Set 1 |      17 |    1.89 |    1.52 |     0.12 |             |             |     0.25 |  61265 |  481 | 477MB |      477MB |          477MB |        NaN% |  |
| 2        | p001           | Set 1 |      18 |    1.91 |    1.68 |     0.14 |             |        0.00 |     0.09 |  66176 |  519 | 515MB |      515MB |          515MB |        NaN% |  |
| 2        | p002           | Set 1 |      19 |    1.87 |    1.51 |     0.14 |             |             |     0.22 |  66672 |  522 | 519MB |      519MB |          519MB |        NaN% |  |
| 2        | p003           | Set 1 |      20 |    1.99 |    1.66 |     0.14 |             |        0.00 |     0.20 |  64478 |  503 | 502MB |      502MB |          502MB |        NaN% |  |
| 2        | p004           | Set 1 |      21 |    1.96 |    1.70 |     0.15 |             |             |     0.12 |  70023 |  551 | 545MB |      545MB |          545MB |        NaN% |  |
| 2        | p005           | Set 1 |      22 |    1.94 |    1.68 |     0.15 |             |             |     0.11 |  63342 |  496 | 493MB |      493MB |          493MB |        NaN% |  |
| 2        | p006           | Set 1 |      23 |    1.89 |    1.68 |     0.14 |             |             |     0.08 |  72179 |  565 | 562MB |      562MB |          562MB |        NaN% |  |
| 2        | p007           | Set 1 |      24 |    2.07 |    1.74 |     0.14 |             |             |     0.19 |  71812 |  562 | 559MB |      559MB |          559MB |        NaN% |  |
| 2        | p008           | Set 1 |      25 |    1.97 |    1.57 |     0.14 |             |             |     0.26 |  64243 |  502 | 500MB |      500MB |          500MB |        NaN% |  |
============================================================================================================================================================================================

Instance Drill-Down
===========================================================================================================================================================================================================================================
| Instance | Process Names                                                                      | Elapsed |   Cpu   |    IO    | Application | Concurrency |  Other   | Buffer | Read | Read  |  Offload   |    Offload     | Wait Events |
|          |                                                                                    | Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Waits(s)   | Waits(s) |  Gets  | Reqs | Bytes | Elig Bytes | Returned Bytes |             |
===========================================================================================================================================================================================================================================
|    1     | QC p000 p001 p002 p003 p004 p005 p006 p007 p008 p009 p00a p00b p00c p00d p00e p00f |      34 |      21 |     1.82 |        0.00 |        0.00 |       11 |   822K | 6436 |   6GB |        6GB |            6GB |             |
|    2     | p000 p001 p002 p003 p004 p005 p006 p007 p008                                       |      18 |      15 |     1.26 |             |        0.00 |     1.52 |   600K | 4701 |   5GB |        5GB |            5GB |             |
===========================================================================================================================================================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=2126708148)
============================================================================================================================================================================
| Id |            Operation             |       Name        |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  |  Mem  | Activity | Activity Detail |
|    |                                  |                   | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes | (Max) |   (%)    |   (# samples)   |
============================================================================================================================================================================
|  0 | SELECT STATEMENT                 |                   |         |       |         1 |     +2 |     1 |        1 |       |       |     . |          |                 |
|  1 |   SORT AGGREGATE                 |                   |       1 |       |         1 |     +2 |     1 |        1 |       |       |     . |          |                 |
|  2 |    PX COORDINATOR                |                   |         |       |         2 |     +1 |    26 |       25 |       |       |     . |          |                 |
|  3 |     PX SEND QC (RANDOM)          | :TQ10000          |       1 |       |         2 |     +1 |    25 |       25 |       |       |     . |          |                 |
|  4 |      SORT AGGREGATE              |                   |       1 |       |         2 |     +1 |    25 |       25 |       |       |     . |          |                 |
|  5 |       PX BLOCK ITERATOR          |                   |    250M | 17097 |         2 |     +1 |    25 |     250M |       |       |     . |          |                 |
|  6 |        TABLE ACCESS STORAGE FULL | EXAMPLE_CUSTOMERS |    250M | 17097 |         3 |     +0 |   379 |     250M | 11137 |  11GB | 165MB |          |                 |
============================================================================================================================================================================
EdG
  • 137
  • 1
  • 10