1

For my job I need to prepare two tables (CTAS) and then do some joins between them. For this job I created a script (run it in SQL Developer) which consequentially creates these two tables one after another. Since these two tables are not related I'd like to start creating them in parallel. Is it possible in SQL script to start two table creations (or two other scripts) in parallel and then proceed when both finish their jobs?

sbrbot
  • 6,169
  • 6
  • 43
  • 74
  • 1
    Single session may perform only one operation (SQL statement) at a time. You have to parallelize the script either at the client side (invoking child processes with new connections) or at the DB side creating separate sessions by `dbms_scheduler` jobs. – astentx Jan 26 '23 at 09:50
  • Thanks @astentx, something like you said presented Littlefoot below. But I was hoping that there could be some hint how to put "job in background" in my script? Nothing like that. – sbrbot Jan 27 '23 at 23:49

2 Answers2

1

Here's one option.

I wouldn't really CTAS - I'd rather create both tables in advance, and then insert rows into them. Why? Because this approach uses stored procedures which - in order to perform DDL (which is CTAS) - require dynamic SQL. Not that it is impossible to do that; on the contrary, but it is way simpler NOT to use it.

I'd create yet another table (let's call it table_done) which contains only one row with two columns: table_1 and table_2 whose values can be 0 (meaning: data for that table is not ready) or 1 (data ready).

Furthermore, I'd create two stored procedures which look the same; the only difference is that each of them inserts rows into its own table:

create procedure p_insert_1 as
begin
  -- remove old data
  execute immediate 'truncate table table_1';

  -- table_1 data not ready
  update table_done set table_1 = 0;

  -- prepare new data
  insert into table_1 (...) select ...;

  -- table_1 data ready
  insert into table_done (table_1) values (1);
  commit;
end;

The 3rd, "main" procedure, is the one you'd run manually. What would it do? Create two one-time database jobs that run immediately, each of them starting its own p_insert procedure so that they run in parallel. That procedure would then (in a loop) check whether both columns in table_done are set to 1 and - if so - continue execution.

create procedure p_main is
  l_job_1 number;
  l_job_2 number;
  --
  l_t1_done number;
  l_t2_done number;
begin
  dbms_job.submit(l_job_1, 'begin p_insert_1; end;');
  dbms_job.submit(l_job_2, 'begin p_insert_2; end;');

  loop
    select table_1, table_2
      into l_t1_done, l_t2_done
      from table_done;

    if l_t1_done = 1 and l_t2_done = 1 then
       -- exit the loop
       exit;
    else 
       -- tables aren't ready yet; wait 60 seconds and try again
       dbms_lock.sleep(60);
    end if;
  end loop;

  -- process data prepared in table_1 and table_2
end;

That's just a simplified idea; I didn't test it myself so I apologize if there are any errors I made. Also,

  • instead of dbms_job, you could choose to use advanced dbms_scheduler
  • if you're on 18c (or later), use dbms_session.sleep instead of dbms_lock.sleep
  • and so forth
Littlefoot
  • 131,892
  • 15
  • 35
  • 57
0

Use SQL parallelism instead of process concurrency. While the words parallelism and concurrency are colloquially interchangeable, in Oracle they have different meanings. Parallelism implies that the SQL engine handles all the coordination of breaking work into little pieces, running those pieces at the same time, and then re-assembling the results at the end. Concurrency implies that the user will create multiple sessions and handle the coordination manually.

For simply creating two tables, parallelism will probably be simpler and faster than concurrency. For parallelism, you may only need to create the table in parallel. (And you probably want to reset the parallelism back to none at the end.)

CREATE TABLE TABLE1 PARALLEL 2 AS SELECT ...;
ALTER TABLE TABLE1 NOPARALLEL;

The PARALLEL 2 option instructs Oracle to run two server processes at the same time while the SQL statement is running. You can easily increase that number, but don't go too high or you'll be stealing too many resources from other sessions.

DBMS_SCHEDULER and other concurrency mechanisms are powerful and useful, but I recommend avoiding them if possible. Running and monitoring scheduler jobs will likely be much more complicated than the preceding code. (Although you may still need to occasionally monitor the parallel SQL statement using a tool like OEM SQL Monitor Reports to ensure that the server is actually using the requested parallelism.)

Jon Heller
  • 34,999
  • 6
  • 74
  • 132
  • From my OP it is obvious, I'm not talking about parallel processing but about parallel execution of different jobs (table creations) - you would, probably it is correct - call it concurrency. – sbrbot Jan 27 '23 at 23:45
  • @sbrbot Sorry if my post came off as being pedantic, that wasn't my intention. I should have more clearly stated that it might be easier to make your current script run twice as fast with SQL parallelism than it would be to cut your script in half and run the two pieces simultaneously. – Jon Heller Jan 28 '23 at 01:44
  • thank you for your response and comment. At the beginning of my script I do alter session and force parallel execution with `ALTER SESSION FORCE PARALLEL DML;` and `ALTER SESSION FORCE PARALLEL QUERY;` So I already do have parallel processing (sharing processes with other users of DWH) but additionally want to have parallel execution. Now I'm thinking if I alter session and put these two parallel executions into scheduler will this session parallelism apply to jobs given to scheduler or I have to use explicit hints /*+ parallel */ – sbrbot Feb 02 '23 at 15:51
  • Mixing parallelism and concurrency should not be necessary. If you're already using parallelism, and you want to double the performance, you should be able to simply double the degree of parallelism. If doubling the DOP doesn't help, then either you've maxed out your server's capacity, or more likely there's some configuration or code issue is preventing parallelism from fully working. I recommend looking at my answer [here](https://stackoverflow.com/a/21132027/409172) for a quick guide on troubleshooting parallel performance. A SQL Monitor report is often necessary to solve these issues. – Jon Heller Feb 02 '23 at 17:25