
I have to select a set of values by joining several large tables, and then update another large table with the selected values. I am currently following the approach below, but I see a performance drawback. What alternative ways are available to do this kind of work?

DB server: Oracle EE

DECLARE
  CURSOR c1
  IS
    -- the actual query joins several large tables; simplified here
    SELECT abc, cdf
    FROM update_data
    FOR UPDATE OF abc;
BEGIN
  FOR rec IN c1
  LOOP
    UPDATE klm
    SET klm.xyz  = rec.cdf
    WHERE klm.abc = rec.abc;
  END LOOP;
  COMMIT;
END;

2 Answers


Most likely a simple update will perform much better.

You can try with:

update klm t1
set xyz = ( select cdf from update_data t2 where t2.abc = t1.abc )
where exists ( select 1 from update_data t2 where t2.abc = t1.abc );

commit;

or, if possible (there is a PK or unique index on update_data.abc):

update ( select t1.xyz, t2.cdf from klm t1, update_data t2 where t1.abc = t2.abc )
set xyz = cdf;

commit;
  • Thanks for the answer. I tried the second approach but got an error: SQL Error: ORA-01779: cannot modify a column which maps to a non key-preserved table – era Nov 05 '13 at 10:54
  • Also, I can't understand the WHERE EXISTS part of the first solution. Why is that check needed? Thanks – era Nov 05 '13 at 11:21
  • In relation to the first answer: update_data is itself a generated query. I don't think it's the best option to execute the update_data SELECT twice (once in the SET part and once in the WHERE EXISTS). Thanks – era Nov 05 '13 at 11:41
  • When I try the first solution I get ORA-01427: single-row subquery returns more than one row. update_data returns thousands of rows to update. Thanks – era Nov 05 '13 at 11:50
  • The SELECT in SET must return only one row: you can take the new value from *exactly* one row per key. – igr Nov 05 '13 at 12:35
  • You can remove the EXISTS clause in the first SQL only if every row in klm finds a value in the source table; otherwise xyz would be set to NULL when no matching row is found. – igr Nov 05 '13 at 12:37
  • Thanks igr. My SELECT returns all the rows that need to be updated. Updating row by row will not work, as my SELECT takes considerable time. – era Nov 05 '13 at 13:08
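Regarding the ORA-01779 and ORA-01427 errors reported in the comments above: when update_data is a generated query that cannot be given a unique key, a MERGE is a common single-statement alternative, provided the query returns at most one row per abc (otherwise it fails with ORA-30926). A sketch, not part of the original answer:

merge into klm t1
using ( select abc, cdf from update_data ) t2
on ( t1.abc = t2.abc )
when matched then
  update set t1.xyz = t2.cdf;

commit;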

If you have performance issues with looping over each record, but the table is too big for a single update, you may consider updating in batches using BULK COLLECT INTO ... LIMIT and FORALL.

CREATE TABLE klm (abc INTEGER, xyz INTEGER);
CREATE TABLE update_data (abc INTEGER, cdf INTEGER);

-- Have pairs of numbers (1000 rows)
INSERT INTO klm SELECT rownum, rownum FROM dual CONNECT BY level <= 1000;
-- Update every second row with 9999
INSERT INTO update_data SELECT rownum * 2, 9999 FROM dual CONNECT BY level <= 500;

DECLARE
  CURSOR c1
  IS
    -- Select the key to be updated and the new value
    SELECT abc, cdf FROM update_data;
  -- Table type and table variable to store rows fetched from the cursor
  TYPE t_update IS TABLE OF c1%rowtype;
  update_tab t_update;
BEGIN
  OPEN c1;
  LOOP
    -- Fetch next 30 rows into update table
    FETCH c1 BULK COLLECT INTO update_tab LIMIT 30;
    -- Exit when there were no more rows fetched
    EXIT WHEN update_tab.count = 0;
    -- This is the key point: update_tab is used to bulk-bind the UPDATE
    -- statement and run it for up to 30 rows in a single context switch
    FORALL i IN 1..update_tab.count
      UPDATE klm
      SET klm.xyz  = update_tab(i).cdf
      WHERE update_tab(i).abc = klm.abc;
    COMMIT;
  END LOOP;
  CLOSE c1;
END;
/

The rationale behind this is that Oracle actually has separate engines for running SQL statements and PL/SQL programs. Whenever a procedure encounters an SQL statement, it hands it over to the SQL engine for execution. This is called a "context switch" and takes a significant amount of time, especially when done in a loop.

Bulk-binding aims to reduce this overhead by doing the context switch only once per [bulk size] records. Again, this is certainly not as effective as a single DML operation, but for large tables or complex queries it may be the best feasible solution.

I've used the above method to update tables with 100M-500M records, with batch sizes of 10K-100K, and it worked fine. But you need to experiment with the batch size in your environment for the best performance.
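For example, here is a rough way to time a batch while tuning (my sketch, not from the original answer; DBMS_UTILITY.GET_TIME returns elapsed hundredths of a second):

DECLARE
  t0 PLS_INTEGER;
BEGIN
  t0 := DBMS_UTILITY.GET_TIME;
  -- stand-in workload; replace with one FORALL batch from the block above
  UPDATE klm SET xyz = xyz WHERE ROWNUM <= 10000;
  ROLLBACK;
  DBMS_OUTPUT.PUT_LINE('batch took ' || (DBMS_UTILITY.GET_TIME - t0) || ' cs');
END;
/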

  • Thanks Kombajn. I can understand the LOOP, but the SELECT part is not clear. Can you explain the selection logic a bit? – era Nov 07 '13 at 06:00
  • The SELECT statement just picks the key (abc) and the new value (cdf) from the update_data table, if that's what you mean. Anyway, I added some comments; I hope it is clearer now. Also take a look at the [Oracle-Base article on bulk binds](http://www.oracle-base.com/articles/9i/bulk-binds-and-record-processing-9i.php) – Kombajn zbożowy Nov 07 '13 at 08:04