
I am working with Oracle 10gR2 or later. I have a table with 20 million records. I want to process all records before exporting them to a file, but the performance is very slow.

Do you have any advice? Below is my code:

declare   
  l_temp_detail clob := null;    
  icount number:=0;

  cursor c_test is
    select /*+ PARALLEL(12) */ *
      from test_table t;

  TYPE t_test IS TABLE OF test_table%ROWTYPE INDEX BY PLS_INTEGER;
  l_test_items t_test;   

BEGIN
  l_temp_detail := '';  

  OPEN c_test;

  LOOP
    FETCH c_test 
      BULK COLLECT INTO l_test_items LIMIT 25000; 
    EXIT WHEN l_test_items.COUNT = 0;

    FOR i in l_test_items.first .. l_test_items.last LOOP    
      icount := icount+1;     
      --doing business here
      l_temp_detail := ....
    END LOOP;      

    dbms_output.put_line(to_char(icount));   
  END LOOP;

  CLOSE c_test;    
END;
ChrisM
    What are you doing to "process" the data? Are you sure that you can't do that processing in SQL? The parallel hint seems rather pointless if you're going to be writing single-threaded code to process the results. Depending on the Oracle version, you may be able to process in parallel using `dbms_parallel_execute` but it's hard to know without knowing what, exactly, your processing consists of. If your processing involves some sort of DML, a `FORALL` loop would be more efficient than a `FOR`. – Justin Cave Oct 29 '14 at 19:34
  • how much slow? bulk collect is not bad, and as Justin said you can improve it with FORALL of course. See this: http://www.oracle.com/technetwork/issue-archive/2012/12-sep/o52plsql-1709862.html – Thomas Oct 29 '14 at 19:42
    So you're reading 20 million rows and putting the results into a CLOB, then writing the results to a file. Are you putting all 20 millions rows worth of data into the CLOB before writing to the file? I'm left wondering where the slow-down is. If the query in your question represents what's really being done, then the query doesn't do much so the slowdown must be elsewhere. Have you tried using a profiler to determine where the code is spending its time? I think that might be a good next step. Best of luck. – Bob Jarvis - Слава Україні Oct 29 '14 at 22:18
  • you aren't using dbms_output.put_line in your actual code are you? If so, that's probably your problem (it's trying to spool to console). Use [UTL_FILE](http://docs.oracle.com/cd/B28359_01/appdev.111/b28419/u_file.htm#BABGGEDF) to write out your data in pl/sql. You should also write to a log table periodically (every 10,000 rows or whatever) so that you know where you're at in the process. For [example](http://stackoverflow.com/questions/4919437/dbms-output-put-line/4921960#4921960) – tbone Oct 30 '14 at 01:41
  • Profile ! Profile ! Profile ! – user272735 Oct 30 '14 at 05:20
  • First determine whether a loop is actually required or whether you can do 'doing business here' in batch – Nick.Mc Nov 01 '14 at 23:46
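Following the UTL_FILE suggestion in the comments, a sketch of writing each batch straight to a file instead of accumulating everything in a CLOB might look like this (the `EXPORT_DIR` directory object and the `id` column are assumptions, not from the question):

```sql
DECLARE
  l_file UTL_FILE.FILE_TYPE;

  CURSOR c_test IS
    SELECT * FROM test_table t;

  TYPE t_test IS TABLE OF test_table%ROWTYPE INDEX BY PLS_INTEGER;
  l_test_items t_test;
BEGIN
  -- EXPORT_DIR is an assumed Oracle directory object pointing at the target folder
  l_file := UTL_FILE.FOPEN('EXPORT_DIR', 'export.txt', 'w', 32767);

  OPEN c_test;
  LOOP
    FETCH c_test BULK COLLECT INTO l_test_items LIMIT 25000;
    EXIT WHEN l_test_items.COUNT = 0;

    FOR i IN 1 .. l_test_items.COUNT LOOP
      -- Build the line for this row and write it immediately, rather than
      -- concatenating 20 million rows into one CLOB first.
      UTL_FILE.PUT_LINE(l_file, TO_CHAR(l_test_items(i).id));  -- "id" is an assumed column
    END LOOP;
  END LOOP;
  CLOSE c_test;

  UTL_FILE.FCLOSE(l_file);
END;
/
```

Writing per batch keeps memory flat and avoids the single huge CLOB; whether it helps here depends on where the time is actually going, so profiling first (as the comments say) is still the right move.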

3 Answers


A couple of things. Select only the columns from test_table that you actually need. From your description, it sounds like you need only a few columns and a limited set of records; you won't be concatenating all 20 million records together.

Additionally, since the table is huge, check whether it is partitioned and indexed. Restricting the query to a particular partition and using the right index will reduce the query execution time.
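A sketch of that advice, using hypothetical `id` and `name` columns and a `created_date` filter that do not appear in the question:

```sql
DECLARE
  -- A narrow record type holding only the columns the export needs,
  -- instead of test_table%ROWTYPE.
  TYPE t_row IS RECORD (
    id   test_table.id%TYPE,    -- assumed column
    name test_table.name%TYPE   -- assumed column
  );
  TYPE t_rows IS TABLE OF t_row INDEX BY PLS_INTEGER;
  l_rows t_rows;

  CURSOR c_test IS
    SELECT t.id, t.name                              -- narrow projection, not SELECT *
      FROM test_table t
     WHERE t.created_date >= TRUNC(SYSDATE) - 7;     -- assumed filter; ideally prunes
                                                     -- to one partition or uses an index
BEGIN
  OPEN c_test;
  LOOP
    FETCH c_test BULK COLLECT INTO l_rows LIMIT 25000;
    EXIT WHEN l_rows.COUNT = 0;

    FOR i IN 1 .. l_rows.COUNT LOOP
      NULL;  -- process l_rows(i) here
    END LOOP;
  END LOOP;
  CLOSE c_test;
END;
/
```

Fetching fewer, smaller columns reduces both the I/O done by the query and the PGA memory consumed by each 25,000-row batch.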

Ram Dwivedi

Parallel table functions enable parallel querying and parallel procedural processing.

Sample table and data.

drop table test_table;
create table test_table(a number);
insert into test_table select level from dual connect by level <= 1000000;
commit;

Return type (it's required even if you don't care about the results)

create or replace type number_nt is table of number;

Parallel pipelined function

--The syntax is very picky if you want to enable parallelism.
create or replace function f(p_cursor in sys_refcursor) return number_nt
    pipelined parallel_enable(partition p_cursor by any) is

    l_temp_detail clob := null;    
    icount number:=0;

    type t_test is table of test_table%rowtype index by pls_integer;
    l_test_items t_test;   
begin
    LOOP
        FETCH p_cursor 
            BULK COLLECT INTO l_test_items LIMIT 25000; 
        EXIT WHEN l_test_items.COUNT = 0;

        FOR i in l_test_items.first .. l_test_items.last LOOP    
            icount := icount+1;     
            --doing business here
            --l_temp_detail := ....
        END LOOP;      

        dbms_output.put_line(to_char(icount));   
    END LOOP;
    CLOSE p_cursor;    

    pipe row (icount);
end;
/

Query and results

--You may want to SUM() to get a total count.
--These aren't the results you care about, but they help show the distribution.
select column_value
from table(f(cursor(select /*+ parallel(12) */ * from test_table)));

COLUMN_VALUE
------------
124080
116160
65340
58080
79860
65340
60060
79860
59400
46960
116160
128700
Jon Heller

I haven't actually tested this on 10g, but on Oracle 9i, processing a huge number of records in a PL/SQL table of records was roughly a factor of 10 less efficient than processing separate tables of single values.

So - if you refactor your code from "table of records" to "record of tables" you may see a significant improvement. You don't actually need to bundle the individual tables in a record - I just like it for clarity.

E.g.

declare
  type tName    is table of mytable.name%type index by pls_integer;
  type tAddress is table of mytable.address%type index by pls_integer;
  type tPerson  is record (
                     name     tName,
                     address  tAddress
                   );
  person tPerson;
begin
  open somecursor;
  loop
    fetch somecursor bulk collect into person.name, person.address limit 25000;
    --
    -- processing
    --
    exit when ...
  end loop;
  close somecursor;
end;

I can't promise that this will solve your problem for sure, though - it depends on what you are actually doing with the data in the "processing" step - e.g. how much time is spent in PL/SQL and how much is spent in SQL doing DML.

Jens Krogsboell