4

Let's say that I have a sql script that looks like this:

--split statement 1
ALTER TABLE abs
  ADD (make    VARCHAR2(2 byte),
     model    varCHAR2(12 BYTE),
     built_on DATE,
     serial    varchar2(123 BYTE));
    /

    --split statement 2 
     declare
     begin
         null;
     end;
     /

     --split statement 3
     insert into test (v,a,c)
     values ('1','jjoe;','232');

     --split statement 4
     create or replace function BLAH_BLAH(i_in varchar2)
     as
         l_one varchar2(12);
         l_two varchar2(12);
         l_three varchar2(12);
     begin
         l_one := 1;
         l_two := 3;

         insert into test (v,a,b)
         values ('1','jjoee;','232');


    exception when no_data_found then
        l_three := 3;


    end;
    /

Basically the script can have DML, DCL, DDL, and anonymous blocks. I want to be able to split each statement and execute them individually, but of course in order that they appear.

I thought of using a regular expression, and I believe the logic would have to be something like this:

1) if the string starts with create|alter|drop|declare, get everything from the start of that string down to the semi-colon that is followed by a a new line, followed by a forward slash (the key here is that in the event of an anonymous block, we have to ignore DML until we reach the end).

2) if the string starts with insert|delete|update|merge (again, ignored if we are already in a block that applies to requirement 1), get everything from the start of that string down to the semi-colon that is followed by a new-line with no forward slash.

So far I've come up with this in Python:

sql_commands = re.split('(?i)(((create|alter|drop|merge)+)(?m);(\s*)\n(\s*))|(;(\s*)\n(\s*)/)',sql_script)  

but everytime I try to move forward with the other requirements the regular expression starts to not work (and actually the output is kind of funky as it is) and becomes complicated enough for me to get lost in.

I'd like to have this done in either Python or Java (Java, I suppose, would actually be preferred given this is an oracle db)

This doesn't have to be a regular expression if regex isn't really up to this task. My ultimate goal is to split out each statement and run it individually so I can catch any errors that come up and gracefully handle them.

Patrick Bateman
  • 271
  • 2
  • 5
  • 14
  • So you are taking a script as input, that have multiple calls in it, and you want to split it apart, and call each part individually? – Jacob Schoen Dec 06 '12 at 14:46
  • That's exactly right. The goal here is if part of the script fails, I either want to flashback the changes or continue running the script depending on the failure. – Patrick Bateman Dec 06 '12 at 16:12

2 Answers2

2

Pseudeo-Code, not tested, just to show the idea:

while (line = readLine()) {
    String cmdString = null;
    if (line.beginsWith("create" || line.beginsWith("alter") ...) {
       String previousLine = line;
       while (line = readLine()) {
          if (line.equals("/") && previousLine.endsWith(";")) {
            executeSQL(cmdString);
            break;
          }
          previousLine = line;
          cmdString = cmdString + line;
       }
    }
    if (line.beginsWith("insert" || line.beginsWith("update") ...) {
       String previousLine = line;
       while (line = readLine()) {
          if (line.equals("\n") && previousLine.endsWith(";")) {
            executeSQL(cmdString);
            break;
          }
          previousLine = line;
          cmdString = cmdString + line;
       }
    }
    // skip others
}
Udo Klimaschewski
  • 5,150
  • 1
  • 28
  • 41
  • looks promising, i've had my head so wrapped around regular expressions that I didn't think of such a simple loop would handle this. I'll test this and get back to you – Patrick Bateman Dec 06 '12 at 15:42
  • This is great stuff, I converted this to Python and it pretty much works as intended. Need to strip whitespace and tabs. Also needed to add "cmdString = cmdString + line;" below "String previousLine = line;" to catch the very first line recorded before entering the second loop. Thanks for this. – Patrick Bateman Dec 06 '12 at 16:10
  • You are welcome. I prefer such solutions. Much more readable and maintainable than such Regex nightmares like you tried. And I would bet, even faster in execution. I read a nice programming tip the other day: Allways code as if the person who will maintain your code is a serial killer that knows where you live. – Udo Klimaschewski Dec 06 '12 at 18:52
-2

Depending on your goal you could feed the file to SQL*Plus, after having inserted the following command at the end of each statement:

pause Press any key to proceed
colemar
  • 125
  • 4
  • Um, that does not solve the problem of finding the individual statements, which is the problem he is trying to solve. – Jacob Schoen Dec 06 '12 at 15:03
  • Quite right jschoen, we already use SQL*Plus but the problem is sql*plus will continue to execute the entire script even if one part fails. – Patrick Bateman Dec 06 '12 at 15:37
  • @Patrick If that is the problem, then use WHENEVER SQLERROR EXIT SQL.SQLCODE – colemar Dec 06 '12 at 19:41
  • If you can use sqlplus >=11 you can even enable an error logging, see http://tkyte.blogspot.it/2010/04/new-thing-about-sqlplus.html – colemar Dec 06 '12 at 19:52
  • @colemar sorry but that is not what I ma looking for. I don't want the script to exit WHENEVER a sql error occurs, it depends on the situation and that's why I am going to these lengths, otherwise I'd agree with you. For example: if "drop table abc;" fails because the specific error is "can't drop table because table does not exist", I don't want the entire script to stop running, it's OK the table doesn't exist because I am trying to drop it anyways. Something more serious like "tablespace cannot extend by..." would be something to stop for. – Patrick Bateman Dec 07 '12 at 14:31
  • @Patrick Do you want to process legacy sql scripts intended for SQL*Plus? If not, you can introduce a requirement that dictates that every statment must end with "/" alone on a line (be it a select or a create procedure); this is compatible with SQL*Plus and greatly simplifies the regular expression needed to split the file in records (every stament is a record) or the regular expression to match the records. In awk it would be: RS="^ */ *\n" – colemar Dec 07 '12 at 15:56
  • @colemar yes and no. Our legacy scripts are not consistent in using "/" (example, "alter session..." in our scripts do not have the forward slash) so that's a policy that will have to change anyways but we are also using what I am writing for future scripts and not legacy scripts. Otherwise yes, I am expecting a forward-slash after DDL and just a like-break/line-feed after a semi-colon for DML. – Patrick Bateman Dec 07 '12 at 16:11