1

Person table have 2 Million rows and 3 Million rows in Finance and on HR schema respectively. Now I'm going to update the Person status on Finance schema with the person_status on the HR schema. Query is running on M3000 Server with 32 GB RAM with Solaris 10 and Oracle 11.2.2. It took 117 hours and still runnning. How can I optimize this query. I have created index on person_no.

DECLARE
  CURSOR c IS  
    SELECT p1.person_no     AS "PersonNo1"
         , p2.person_no     AS "PersonNo2"
         , p2.person_status AS "P2_PERSON_STATUS"
      FROM person    p1
         , hr.person p2    
     WHERE (lower(p1.person_no) = lower(p2.person_no)           
           OR substr(p1.person_no, instr(p1.person_no, '-') + 1, length(p1.person_no))
              = substr(p2.person_no, instr(p2.person_no, '-') + 1, length(p2.person_no)));
BEGIN

  FOR i IN c LOOP  
    UPDATE person    
       SET person_status_id = decode(lower(i.P2_PERSON_STATUS), 'y', 1, 'n', 2)    
     WHERE (lower(person_no) = lower(i.PersonNo2)           
        OR substr(person_no, instr(person_no, '-') + 1, length(person_no))
           = substr(i.PersonNo2, instr(i.PersonNo2, '-') + 1, length(i.PersonNo2)));  
    COMMIT;  
  END LOOP;

END;
/
bilalhaider
  • 181
  • 2
  • 5
  • 15
  • 1
    First, you don't have `i.person_no` column. In your cursor there is only `PersonNo1` or `p2` column. Second, you already checked `substr(p1.person_no,instr(....` in your cursor, why are you doing it for the second time in your update? Which is the type of `person_no` column? – Tatiana Nov 16 '15 at 08:26
  • Can you have do a query plan ? To make sure you are readubg the index you have created – Moudiz Nov 16 '15 at 08:28
  • Actually, why at all do you need pl/sql function? It looks like you can use only SQL. For example, MERGE function – Tatiana Nov 16 '15 at 08:28
  • 1
    the index on person_no cannot be used, because `lower`and `substr`functions on this column prevent this. – Frank Ockenfuss Nov 16 '15 at 08:36
  • 3
    This update can be done by a simple update statement. pl/sql is not required. in your for loop you update the whole table as already selected by your cursor. this doesn't make sense. – Frank Ockenfuss Nov 16 '15 at 08:44
  • What you might find here is the your querys wil suddenly fail with a `snapshot too old` error because you are commiting on every row modified inside a cursor. That commit is synchronous, so the query has to wait 2,000,000 times for it to complete. So first of al that should not have been there. As Frank says, a single SQL statement should be all you need anyway. – David Aldridge Nov 16 '15 at 09:52
  • @Tatiana It was a typo erro as query was running on server not connected to internet. I typed directly. On finance schema person_no column is a number and on HR schema it is varchar2(1); In cursor i got all the matching rows and in loop i am updating the records where person_no of finance schema matches with person_no of hr schema. – bilalhaider Nov 16 '15 at 10:12
  • @FrankOckenfuss how can i optimize my loop? – bilalhaider Nov 16 '15 at 10:13
  • Try to commit out of loop as that will be in last. – Muhammad Muazzam Nov 16 '15 at 10:16
  • You're also effectively finding the row in the person table that you want to update, and then not using that information when you run the update. How about selecting ROWID in the cursor and then using that to update the person table? Still a single SQL statement would be faster. – David Aldridge Nov 16 '15 at 10:16
  • 1
    1. You can actually use a simple `UPDATE` statement (no need to use PL/SQL and all your loop). 2. To use indexes on your tables, you could try creating [Function Based Indexes](https://oracle-base.com/articles/8i/function-based-indexes) , also see [When to use Function Base Indexes](https://docs.oracle.com/cd/E11882_01/appdev.112/e41502/adfns_indexes.htm#ADFNS005). – AndrewMcCoist Nov 16 '15 at 10:22

2 Answers2

2

There is several things you can do to optimize this. It would be helpful to see a query plan.

You can use a merge statement. Then you have only one statement and you can work optimizing this statement. Something like this

merge into person 
using hr.person person_hr 
on (person.person_no=person_hr.person_no) 
when matched then 
update set person_status_id=decode(lower(person_status),'y',1,'n',2);

You have to adjust the on part to match your where statement.

This might then be the source of best optimization. Maybe you have to create an index for lower and also for substr.

Something like

CREATE INDEX person_idx
 ON person (lower(person_no))

Also of course for the substrings etc. Hope this helps.

hol
  • 8,255
  • 5
  • 33
  • 59
  • I will incorporate the above suggestions and will update you on the query performance. – bilalhaider Nov 16 '15 at 10:17
  • I killed the session of last query. And modified the query as suggested. Created Function based indexes. It had duplicate rows in target schema so i got ORA-30926: unable to get a stable set of rows in the source tables. I removed the duplicate rows from target schema as suggested in http://stackoverflow.com/questions/2337271/ora-30926-unable-to-get-a-stable-set-of-rows-in-the-source-tables but still ORA-30926 . Task is still pending. – bilalhaider Nov 19 '15 at 05:22
0

In the code provided as question i can see there are two aspects.

1) Query Optimization
2) ROW BY ROW Update --> Never Recommended for such a huge records count..

Approach to Optmization.

1) Use Merge Statement as it will Bundle up the block into pure SQL thus will enhance your query.

2) Use Function based Index as i can see there are lot of functions like "LOWER" is used in the query "WHERE" conditions.
NOTE : [Over INDEXING may also cause deterioration in the performance}

3) Last but not the least if you have to go by Anonymous block only then avoid using ROW BY ROW Update. Try using Bulk collect Options as illustrated below.
Since i do not have workstation handy with me please bear with any Syntax Errors.
Let me know if this helps.

DECLARE
  TYPE p1
IS
  TABLE OF <table_name>.<COLUMN_NAME>%TYPE;
  PersonNo1 p1;
TYPE p2
IS
  TABLE OF <table_name>.<COLUMN_NAME>%TYPE;
  PersonNo2 p2;
TYPE status
IS
  TABLE OF <table_name>.<COLUMN_NAME>%TYPE;
  P2_PERSON_STATUS status;
BEGIN
  SELECT p1.person_no AS "PersonNo1" ,
    p2.person_no      AS "PersonNo2" ,
    p2.person_status  AS "P2_PERSON_STATUS" BULK COLLECT
  INTO PersonNo1,
    PersonNo2,
    P2_PERSON_STATUS
  FROM person p1 ,
    hr.person p2
  WHERE (lower(p1.person_no)                                                  = lower(p2.person_no)
  OR SUBSTR(p1.person_no, instr(p1.person_no, '-') + 1, LENGTH(p1.person_no)) = SUBSTR(p2.person_no, instr(p2.person_no, '-') + 1, LENGTH(p2.person_no)));
  FORALL I                                        IN PersonNo1.FIRST..PersonNo1.LAST
  UPDATE person
  SET person_status_id                                               = DECODE(lower(P2_PERSON_STATUS(I)), 'y', 1, 'n', 2)
  WHERE (lower(person_no)                                            = lower(PersonNo2(I))
  OR SUBSTR(person_no, instr(person_no, '-') + 1, LENGTH(person_no)) = SUBSTR(PersonNo2(I), instr(PersonNo2(I), '-') + 1, LENGTH(PersonNo2(I))));

COMMIT;

END;
Avrajit Roy
  • 3,233
  • 1
  • 13
  • 25