Let's say that I have a sql script that looks like this:
--split statement 1
ALTER TABLE abs
ADD (make VARCHAR2(2 byte),
model varCHAR2(12 BYTE),
built_on DATE,
serial varchar2(123 BYTE));
/
--split statement 2
declare
begin
null;
end;
/
--split statement 3
insert into test (v,a,c)
values ('1','jjoe;','232');
--split statement 4
create or replace function BLAH_BLAH(i_in varchar2)
as
l_one varchar2(12);
l_two varchar2(12);
l_three varchar2(12);
begin
l_one := 1;
l_two := 3;
insert into test (v,a,b)
values ('1','jjoee;','232');
exception when no_data_found then
l_three := 3;
end;
/
Basically the script can have DML, DCL, DDL, and anonymous blocks. I want to be able to split each statement and execute them individually, but of course in order that they appear.
I thought of using a regular expression, and I believe the logic would have to be something like this:
1) if the string starts with create|alter|drop|declare, get everything from the start of that string down to the semi-colon that is followed by a a new line, followed by a forward slash (the key here is that in the event of an anonymous block, we have to ignore DML until we reach the end).
2) if the string starts with insert|delete|update|merge (again, ignored if we are already in a block that applies to requirement 1), get everything from the start of that string down to the semi-colon that is followed by a new-line with no forward slash.
So far I've come up with this in Python:
sql_commands = re.split('(?i)(((create|alter|drop|merge)+)(?m);(\s*)\n(\s*))|(;(\s*)\n(\s*)/)',sql_script)
but everytime I try to move forward with the other requirements the regular expression starts to not work (and actually the output is kind of funky as it is) and becomes complicated enough for me to get lost in.
I'd like to have this done in either Python or Java (Java, I suppose, would actually be preferred given this is an oracle db)
This doesn't have to be a regular expression if regex isn't really up to this task. My ultimate goal is to split out each statement and run it individually so I can catch any errors that come up and gracefully handle them.