1

UPDATE: To make this question easier. I have changed the string and now you don't have to worry about BEGIN END at all. Instead I now have only GO statements as terminators. I hope I will get some answers from other people now

I need to write a regular expression which can detect hidden DML statements like INSERT, UPDATE, DELETE in a DDL script (CREATE, ALTER, DROP).

Example. In the script below it should catch 2 delete table5 statements and the last insert into table3 statement. But it should ignore the insert statements which are actually in the body of the stored proc itself.

If you run this RegEx here : http://regexr.com?33rf3, you will see that I have accomplished 90% of it. The only part that is not working is that it is doing a greedy match until the last GO. I need it to stop at the first GO. I have tried using +? operator to make it non-greedy but it doesn't want to work.

delete table5 /*Regex should find this*/

GO

create proc abc
begin
    insert into table1  /*Regex should IGNORE this*/
    begin
        fake nesting here
        insert into table2 /*Regex should IGNORE this*/
    end
end

GO

delete table5 /*Regex should find this*/

GO

alter proc abc2
begin
  --no statements in this proc
end

GO

insert into table3 /*Regex should find this*/
StackThis
  • 1,262
  • 1
  • 14
  • 23
  • Only .NET regular expression realization can be used in this case because of nested brackets (begin/end). Another option is to assume that it will be limited nesting of begin/end – Teddy Bo Feb 19 '13 at 23:58
  • Forunately I am using C#. Any ideas what the regex would look like? You can also assume if it makes it simple that there are not begin ends...and all statements end with a GO – StackThis Feb 20 '13 at 15:23
  • I think you're barking up the wrong tree. I'd be surprised if you got this to work with regular expressions. http://stackoverflow.com/questions/133601/can-regular-expressions-be-used-to-match-nested-patterns – Jim Mischel Feb 20 '13 at 19:09
  • Actually, it *might* be possible with .NET regex. Some answers to that question give pointers on how you might approach the problem. – Jim Mischel Feb 20 '13 at 19:12
  • 1
    If you don't trust your scripts, then you shouldn't run them, or you should run them from an account which can't do any damage. – John Saunders Feb 20 '13 at 23:37
  • 1
    @John Saunders - without knowing the background why I am doing this, it is unfair of you to add that comment. There is a very solid reason for doing this which I cannot explain here. And stackexchange is for asking technical questions not provide a business reason behind them. And I think I have provided enough information in the question for people to be able to answer it. – StackThis Feb 21 '13 at 02:46
  • Sorry, no. The alternative if you can't trust your scripts is to create a constrained or sandboxed environment in which to run them. Your question doesn't suggest that you've done anything like that. So, again, if you don't trust them to not do damage, then don't run them. – John Saunders Feb 21 '13 at 02:49
  • Unless you're not going to _run_ them at all, in which case, you should say so. – John Saunders Feb 21 '13 at 02:49
  • Yes, I am not going to run them at all in that case..but I don't see the need for stating that. The question is self-contained and fine just the way it is – StackThis Feb 21 '13 at 15:34

1 Answers1

2

GO statements aren't needed.

        var regex1 = new Regex(@"
          (?s)
          (create|alter|drop).+?
          begin
            (?:
               (?> (?! begin | end ) .
                  | begin (?<counter>)
                  | end  (?<-counter>)
                )*
            )
            (?(counter)(?!))
          end
          "
        , RegexOptions.IgnorePatternWhitespace
        );

        var regex2 = new Regex(@"(?:insert|update|delete).+");

        var result = regex2.Matches(regex1.Replace(your_input, ""))
            .Cast<Match>()
            .Select(m => m.Value);

This code deletes all create/alter/drop statements and than looks for delete/alter/drop. I think it could be done using only one regex, but it is all I can suggest now, sorry:)

Teddy Bo
  • 679
  • 8
  • 19
  • Thanks Teddy. I will definitely update you when I get to work tomorrow and try this out. This is sufficient for me. I am not really looking for the most optimal solution right now. The one that works is the right solution for me :) – StackThis Feb 21 '13 at 02:48