Merge is taking time to update

Question

I am trying to update one column CODE from table TAB1 using REF_TAB table having 200 mn and 30 mn records respectively, but it is taking 3-4 hrs to update the records.

TAB1  -- 200000000(200mn)
ID VARCHAR2(10)  PK on ID
T_ID VARCHAR2(10) --- duplicate values
X_CD VARCHAR2(2)
CODE VARCHAR2(2) -- list partition with 30 codes

REF_TAB --30000000(30 mn)
T_ID VARCHAR2(10) --- unique values
CODE VARCHAR2(2)

The unique index on TAB1.ID primary key is the only index.

I have tried MERGE. Is there any other way to improve the performance of the query or by any other means?

MERGE /*+ parallel(A,4) append */ INTO TAB1 A
USING (SELECT T_ID,CODE FROM REF_TAB) R
ON (    A.T_ID = R.T_ID
        AND A.X_CD IS NOT NULL
    )
WHEN MATCHED THEN
UPDATE
SET A.CODE = R.CODE;

COMMIT;

with append
Plan hash value: 3594611329

--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name                    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT           |                         |    93M|   714M|   494K  (4)| 00:00:03 |       |       |        |      |            |
|   1 |  MERGE                    | TAB1           |       |       |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR          |                         |       |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)    | :TQ10001                |    93M|    27G|   494K  (4)| 00:00:03 |       |       |  Q1,01 | P->S | QC (RAND)  |
|   4 |     VIEW                  |                         |       |       |            |          |       |       |  Q1,01 | PCWP |            |
|*  5 |      HASH JOIN            |                         |    93M|    27G|   494K  (4)| 00:00:03 |       |       |  Q1,01 | PCWP |            |
|   6 |       PX RECEIVE          |                         |    32M|   374M|  8430   (5)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   7 |        PX SEND BROADCAST  | :TQ10000                |    32M|   374M|  8430   (5)| 00:00:01 |       |       |  Q1,00 | S->P | BROADCAST  |
|   8 |         PX SELECTOR       |                         |       |       |            |          |       |       |  Q1,00 | SCWC |            |
|   9 |          TABLE ACCESS FULL| REF_TAB                 |    32M|   374M|  8430   (5)| 00:00:01 |       |       |  Q1,00 | SCWP |            |
|  10 |       PX BLOCK ITERATOR   |                         |    93M|    26G|   484K  (3)| 00:00:03 |     1 |    35 |  Q1,01 | PCWC |            |
|* 11 |        TABLE ACCESS FULL  | TAB1                    |    93M|    26G|   484K  (3)| 00:00:03 |     1 |    35 |  Q1,01 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------------------------------------


  without append

 Plan hash value: 3594611329

--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name                    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT           |                         |   103M|   790M|   494K  (4)| 00:00:03 |       |       |        |      |            |
|   1 |  MERGE                    | TAB1                    |       |       |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR          |                         |       |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)    | :TQ10001                |   103M|    30G|   494K  (4)| 00:00:03 |       |       |  Q1,01 | P->S | QC (RAND)  |
|   4 |     VIEW                  |                         |       |       |            |          |       |       |  Q1,01 | PCWP |            |
|*  5 |      HASH JOIN            |                         |   103M|    30G|   494K  (4)| 00:00:03 |       |       |  Q1,01 | PCWP |            |
|   6 |       PX RECEIVE          |                         |    32M|   374M|  8430   (5)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   7 |        PX SEND BROADCAST  | :TQ10000                |    32M|   374M|  8430   (5)| 00:00:01 |       |       |  Q1,00 | S->P | BROADCAST  |
|   8 |         PX SELECTOR       |                         |       |       |            |          |       |       |  Q1,00 | SCWC |            |
|   9 |          TABLE ACCESS FULL| REF_TAB                 |    32M|   374M|  8430   (5)| 00:00:01 |       |       |  Q1,00 | SCWP |            |
|  10 |       PX BLOCK ITERATOR   |                         |   103M|    29G|   485K  (4)| 00:00:03 |     1 |    35 |  Q1,01 | PCWC |            |
|* 11 |        TABLE ACCESS FULL  | TAB1                    |   103M|    29G|   485K  (4)| 00:00:03 |     1 |    35 |  Q1,01 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------------------------------------

Tuning Oracle (or any database) queries requires lots of specific information which you haven't provided, such as indexes, data skew, load on the database. Please read [this SO answer](https://stackoverflow.com/a/34975420/146325) which explains how to ask such questions. Otherwise we're just guessing and you're no nearer to getting a helpful answer. — APC, Nov 05 '19 at 07:16
There is only one unique index on ID col which i have mentioned with the total record count for each table, — kashi, Nov 05 '19 at 07:21
You say `T_ID VARCHAR2(10) --- duplicate values`. So is `A.T_ID` distinct when `A.X_CD IS NOT NULL`? — APC, Nov 05 '19 at 07:24
not necessary , there can be multiple records with A.X_CD is not null in TAB1 and it has to update all those records.. — kashi, Nov 05 '19 at 07:29
@APC correct , I have modified and added the plan with and without append, — kashi, Nov 05 '19 at 08:13
@APC Sorry about missing information ..REF_TAB is a permanent table and this update will run daily . Nearly 50% values match to TAB1.T_ID. Nearly 20% have null X_CD. — kashi, Nov 05 '19 at 09:36

Radagast81 · Answer 1 · 2019-11-05T09:27:11.620

0

1) Add indices

You can add an index on column T_ID in table tab1. That should speed up the statement a lot. Or what also might help as you say T_ID in REF_TAB is unique if you make that column the primary key that should also help.

In both cases you should get rid of one full table scan and replace it with an index or index range scan, which should be faster.

2) Create table solution

One other solution that should be faster is to create a new table tab2:

create table tab2 as
SELECT t.ID
     , T_ID
     , t.X_CD
     , r.CODE
  FROM tab1 t
  JOIN ref_tab r USING(T_ID)

Add the indices, grants, constraints to tab2, drop tab1 and then rename tab2 to tab1. This way you don't store rollback information, which is the main resource problem when doing mass updates.

3) Use Foreign Key + Join instead of duplicate data (DB normalization)

You might also consider to add a foreign key on tab1:

ALTER TABLE tab1
ADD CONSTRAINT tab1_fk FOREIGN KEY (T_ID)
REFERENCES REF_TAB (T_ID)

And then drop CODE column from tab1 and read the value per JOIN - so you don't have to update the value, when it is changed.

edited Nov 05 '19 at 09:27

answered Nov 05 '19 at 08:51

Radagast81

2,921
1
7
21

The statement **must** read and apply all the rows in REF_TAB. So there is no performance benefit from an index on that table (I agree uniqueness should be enforced with a primary key constraint but that's a different matter). – APC Nov 05 '19 at 09:01
Do you think BULK Update will help in this case instead of MERGE? – kashi Nov 05 '19 at 09:04
@APC I'm not to sure about that. As there is no insert statement in the merge, you only have to consider records allready present in tab1, so you can loop over that records and read the code via index on REF_TAB (meaning FULL TABLE scan on tab1 and INDEX scan on REF_TAB). Not sure tho if optimizer is aware of that. – Radagast81 Nov 05 '19 at 09:12
@Radagast81 I have tried the your second solution by creating the intermediate table TEMP same as TAB1 and then inserting into TAB1 using TEMP LOJ with REF_TAB ON T_ID but didnt help much.. can we think of bulk update as an alternative. – kashi Nov 05 '19 at 09:46
@kashi You can try that but i don't think that really makes a difference until you have a unique index on REF_TAB. – Radagast81 Nov 05 '19 at 09:50
I cant alter TAB1 but yes i can alter REF_TAB to add constraint..let me give it a try,Thanks – kashi Nov 05 '19 at 09:55

Merge is taking time to update

1 Answers1