A simple table join usually completes in 0.0XX seconds, but sometimes takes 2.0XX seconds (according to PL/SQL Developer SQL execution). It still happens when running from SQL*Plus.

If I run the SQL 10 times, it runs fine 8 times and takes 2+ seconds the other 2 times.

It's a clean install of Oracle 11.2.0.4 for Linux x86_64 on CentOS 7. I've installed the Oracle-recommended patches:

  • Patch 19769489 - Database Patch Set Update 11.2.0.4.5 (Includes CPUJan2015)
  • Patch 19877440 - Oracle JavaVM Component 11.2.0.4.2 Database PSU (Jan2015)

No change after patching.

The 2 tables are small: LNK_PACK_REP has 13 rows and PACKAGES has 6 rows.

In SQL*Plus I've enabled all statistics and ran the SQL multiple times. Only the elapsed time changes, from 0.1 to 2.1 seconds from time to time. No other statistic differs when I compare a 0.1-second run with a 2.1-second run. The server has 16 GB RAM and 8 CPU cores. Server load is under 0.1 (no user is using the server at the moment).
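For reference, the timings and statistics above were gathered in SQL*Plus roughly like this (a sketch; SET AUTOTRACE requires the PLUSTRACE role or equivalent privileges):

```sql
-- Show elapsed time and execution statistics for each statement
SET TIMING ON
SET AUTOTRACE ON STATISTICS

-- Run the join several times and compare the "Elapsed:" lines
SELECT package_id, id, package_name
  FROM lnk_pack_rep lnkpr
 INNER JOIN packages p ON lnkpr.package_id = p.id;
```

Repeating the statement a dozen times makes the intermittent 2-second runs easy to spot, since the autotrace statistics stay identical while only the elapsed time varies.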

Output:

SQL> select PACKAGE_ID, id, package_name from LNK_PACK_REP LNKPR INNER JOIN PACKAGES P ON LNKPR.PACKAGE_ID = P.ID;

PACKAGE_ID         ID PACKAGE_NAME
---------- ---------- ------------------------------
     3          3 RAPOARTE
     3          3 RAPOARTE
   121        121 VANZARI
   121        121 VANZARI
   121        121 VANZARI
     2          2 PACHETE
     2          2 PACHETE
     1          1 DEPARTAMENTE
     1          1 DEPARTAMENTE
    81         81 ROLURI
    81         81 ROLURI

PACKAGE_ID         ID PACKAGE_NAME
---------- ---------- ------------------------------
   101        101 UTILIZATORI
   101        101 UTILIZATORI

13 rows selected.

Elapsed: 00:00:02.01

Execution Plan

Plan hash value: 2671988802

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name              | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                   |    13 |   351 |     3   (0)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR         |                   |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10002          |    13 |   351 |     3   (0)| 00:00:01 |  Q1,02 | P->S | QC (RAND)  |
|*  3 |    HASH JOIN            |                   |    13 |   351 |     3   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|   4 |     PX RECEIVE          |                   |     6 |    84 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|   5 |      PX SEND HASH       | :TQ10001          |     6 |    84 |     2   (0)| 00:00:01 |  Q1,01 | P->P | HASH       |
|   6 |       PX BLOCK ITERATOR |                   |     6 |    84 |     2   (0)| 00:00:01 |  Q1,01 | PCWC |            |
|   7 |        TABLE ACCESS FULL| PACKAGES          |     6 |    84 |     2   (0)| 00:00:01 |  Q1,01 | PCWP |            |
|   8 |     BUFFER SORT         |                   |       |       |            |          |  Q1,02 | PCWC |            |
|   9 |      PX RECEIVE         |                   |    13 |   169 |     1   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|  10 |       PX SEND HASH      | :TQ10000          |    13 |   169 |     1   (0)| 00:00:01 |        | S->P | HASH       |
|  11 |        INDEX FULL SCAN  | UNQ_PACK_REP      |    13 |   169 |     1   (0)| 00:00:01 |        |      |            |
--------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

3 - access("LNKPR"."PACKAGE_ID"="P"."ID")

Note

  • dynamic sampling used for this statement (level=2)

Statistics

     24  recursive calls
      0  db block gets
     10  consistent gets
      0  physical reads
      0  redo size
    923  bytes sent via SQL*Net to client
    524  bytes received via SQL*Net from client
      2  SQL*Net roundtrips to/from client
      4  sorts (memory)
      0  sorts (disk)
     13  rows processed

Table 1 structure:

-- Create table
create table PACKAGES
(
  id           NUMBER(3) not null,
  package_name VARCHAR2(150),
  position     NUMBER(3),
  activ        NUMBER(1)
)
tablespace UM
  pctfree 10
  initrans 1
  maxtrans 255
  storage
  (
    initial 64K
    next 1M
    minextents 1
    maxextents unlimited
  );
-- Create/Recreate primary, unique and foreign key constraints 
alter table PACKAGES
  add constraint PACKAGES_ID primary key (ID)
  using index 
  tablespace UM
  pctfree 10
  initrans 2
  maxtrans 255
  storage
  (
    initial 64K
    next 1M
    minextents 1
    maxextents unlimited
  );
-- Create/Recreate indexes 
create index PACKAGES_ACTIV on PACKAGES (ID, ACTIV)
  tablespace UM
  pctfree 10
  initrans 2
  maxtrans 255
  storage
  (
    initial 64K
    next 1M
    minextents 1
    maxextents unlimited
  );

Table 2 structure:

-- Create table
create table LNK_PACK_REP
(
  package_id NUMBER(3) not null,
  report_id  NUMBER(3) not null
)
tablespace UM
  pctfree 10
  initrans 1
  maxtrans 255
  storage
  (
    initial 64K
    next 1M
    minextents 1
    maxextents unlimited
  );
-- Create/Recreate primary, unique and foreign key constraints 
alter table LNK_PACK_REP
  add constraint UNQ_PACK_REP primary key (PACKAGE_ID, REPORT_ID)
  using index 
  tablespace UM
  pctfree 10
  initrans 2
  maxtrans 255
  storage
  (
    initial 64K
    next 1M
    minextents 1
    maxextents unlimited
  );
-- Create/Recreate indexes 
create index LNK_PACK_REP_REPORT_ID on LNK_PACK_REP (REPORT_ID)
  tablespace UM
  pctfree 10
  initrans 2
  maxtrans 255
  storage
  (
    initial 64K
    next 1M
    minextents 1
    maxextents unlimited
  );

In Oracle Enterprise Manager, in SQL Monitor, I can see the SQL that is run multiple times. All runs show "Database Time" 0.0s (under 10 microseconds if I hover over the list) and "Duration" 0.0s for a normal run, but 2.0s for those with the delay. If I go to Monitored SQL Executions for a 2.0s run I see:

  • Duration: 2.0s
  • Database Time: 0.0s
  • PL/SQL & Java: 0.0
  • Wait activity: % (no number here)
  • Buffer gets: 10
  • IO Requests: 0
  • IO Bytes: 0
  • Fetch calls: 2
  • Parallel: 4

These numbers are consistent with a fast run, except for Duration. On a fast run, Duration can even be smaller than Database Time (10,163 microseconds Database Time versus 3,748 microseconds Duration), both displayed as 0.0s unless I hover with the mouse.

I don't know what else to check.
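One more thing worth checking is whether a parallel degree is set on the tables or indexes themselves, since that alone can make the optimizer choose the PX plan shown above. A hedged sketch using the standard data-dictionary views:

```sql
-- DEGREE = 'DEFAULT' or a number greater than 1 means the
-- optimizer is allowed to parallelize scans of these objects
SELECT table_name, degree
  FROM user_tables
 WHERE table_name IN ('PACKAGES', 'LNK_PACK_REP');

SELECT index_name, degree
  FROM user_indexes
 WHERE table_name IN ('PACKAGES', 'LNK_PACK_REP');
```

If either query reports a non-1 degree, `ALTER TABLE ... NOPARALLEL` (or the equivalent on the index) would force a serial plan for these small tables.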

  • Parallel query? I'd get that disabled on a table that small. – David Aldridge Apr 10 '15 at 15:51
  • That's the default setting. I have no SQL hint, nor have I changed any settings related to parallel execution. And the tables will grow. The original SQL had 10 tables joined, but I've managed to reduce the problem to a two-table join with no change in outcome. – Cristi Cotet Apr 10 '15 at 15:58
  • Have you verified the delay is on the database side and not in the SQL*Plus client? – Andrew Henle Apr 10 '15 at 16:16
  • No problem running PQ on larger tables, but I'd try to eliminate it as the cause of this problem because it requires a pre-query execution checkpoint among other issues. – David Aldridge Apr 10 '15 at 16:23
  • I assume you've checked the alert log to look for things like log writer waiting for a log switch, etc? – Mark Wagoner Apr 10 '15 at 20:05

2 Answers


Parallel queries cannot be meaningfully tuned to within a few seconds. They are designed for queries that process large amounts of data for a long time.

The best way to optimize parallel statements on small data sets is to temporarily disable parallelism:

alter system set parallel_max_servers=0;

(This is a good example of the advantages of developing on workstations instead of servers. On a server, this change affects everyone and you probably don't even have the privilege to run the command.)

The query may be simple but parallelism adds a lot of complexity in the background.

It's hard to say exactly why it's slower. If you have the SQL Monitoring report, the wait events may help; but even those numbers may just be generic waits like "CPU". Parallel queries have a lot of overhead, in expectation of a resource-intensive, long-running query. Here are some types of overhead that may explain where those 2 seconds come from:

  1. Dynamic sampling - Parallelism may automatically cause dynamic sampling, which reads data from the tables. Although "dynamic sampling used for this statement (level=2)" may just imply missing optimizer statistics.
  2. OS Thread startup - The SQL statement probably needs to start up 8 additional OS threads, and prepare a large amount of memory to hold all the intermediate data. Perhaps the parameter PARALLEL_MIN_SERVERS could help prevent some time used to create those threads.
  3. Additional monitoring - Parallel statements are automatically monitored, which requires recursive SELECTs and INSERTs.
  4. Caching - Parallel queries often read directly from disk and skip reading and writing into the buffer cache. The rules for when it caches data are complicated and undocumented.
  5. Downgrading - Finding the correct degree of parallelism is complicated. For example, I've compiled a list of 39 factors that influence the DOP. It's possible that one of those is causing downgrading, making some queries fast and others slow.

And there are probably dozens of other types of overhead I can't think of. Parallelism is great for massively improving the run-time of huge operations. But it doesn't work well for tiny queries.
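If the thread-startup overhead in point 2 turns out to be the culprit, one possible mitigation is keeping a pool of parallel servers warm so statements don't pay the startup cost. A sketch (the value 8 is illustrative, not a recommendation; sizing depends on your workload):

```sql
-- Keep a minimum pool of parallel execution servers running
-- so parallel statements don't pay process-startup cost
ALTER SYSTEM SET parallel_min_servers = 8;

-- Check how many parallel execution servers are currently alive
SELECT server_name, status
  FROM v$px_process;
```

Comparing the 2-second runs before and after pre-starting the servers would confirm or rule out this particular overhead.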

– Jon Heller

The delay is due to parallelism, as suggested by David Aldridge and Jon Heller, but I don't agree with Jon Heller's proposed solution of disabling parallelism for all queries (at the system level). You can use "alter session" to disable it and re-enable it before running big queries. The exact reason for the delay is still unknown, as the query finishes fast in 8 out of 10 runs and I would expect 10 out of 10 fast runs.
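The session-level approach mentioned above looks like this:

```sql
-- Disable parallel query for this session only, so small
-- latency-sensitive statements run serially
ALTER SESSION DISABLE PARALLEL QUERY;

-- ... run the small, frequent joins here ...

-- Re-enable before running big reporting queries
ALTER SESSION ENABLE PARALLEL QUERY;
```

This leaves the system-wide settings untouched, so other sessions and large batch jobs keep their parallel plans.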

  • Setting it at the session level would be the least invasive way to manage parallelism. But keep in mind that statement hints can override the session values. If this is part of a huge program with multiple developers building large queries, it's probably inevitable that people will start throwing in parallel hints. – Jon Heller Apr 24 '15 at 17:52