7

I'm working on a Java based OSS app SqlHawk which as one of its features is to run upgrade sql scripts against a server.

Microsoft have made it a convention to split a script into batches with the GO statement, which is a good idea but just asking for false matches on the string.

At the moment I have a very rudimentary:

// split where GO on its own on a line
Pattern batchSplitter = Pattern.compile("^GO", Pattern.MULTILINE);
...
String[] splitSql = batchSplitter.split(definition);
...

which kind of works but is prone to being tripped up by things like quoted GO statements or indentation issues.

I think the only way to make this truly reliable is to have an SQL parser in the app, but I have no idea how to go about this, or whether that might actually end up being less reliable (especially given this tool supports multiple DBMSs).

What ways could I solve this problem? Code examples would be very helpful to me here.

Relevant sqlHawk code on github.

Currently using jtds to execute the batches found in the scripts.

Community
  • 1
  • 1
Tim Abell
  • 11,186
  • 8
  • 79
  • 110
  • 1
    I did it your way when I had to do a similar thing. It's only used internally and works fine for us. – brain May 24 '12 at 09:41
  • 1
    Idem here... It's far from fool-proof and you need to decide if you want to support things like 'GO 100'. The main hurdle IMHO is when the GO is part of a /* comment block */. Those require quite a bit of extra processing. (strip out all comments & comment block first, then do the split is what we did) PS: You can prefix the code with `SET PARSEONLY ON` and if there is no 'Incorrect syntax near GO' there is no need to split the code.. otherwise you might use the line-location indicated by the error to 'pinpoint where to split on a `GO`... might work, feels more complex than needed though... – deroby May 24 '12 at 10:40
  • example to complicate things : `EXEC (' SET PARSEONLY ON;SELECT * FROM t_entity; GO; PRINT 0')` – deroby May 24 '12 at 10:41
  • Could your project export the ddl modifications in an XML format then process the statements separately? http://db.apache.org/ddlutils/documentation.html does something like this. – pd40 Jun 02 '12 at 13:37
  • Gave up, left an open issue https://github.com/timabell/sqlHawk/issues/49 & went for requiring GO to be on its own on a line. – Tim Abell Jun 05 '12 at 11:32
  • @TimAbell - did you look at the code I posted? I wouldn't have thought it would be so hard to convert C# to Java. – Matt Whitfield Jun 06 '12 at 22:57
  • I did but I haven't had time to go through it in more detail. It looks good and I've voted up your answer. – Tim Abell Jun 07 '12 at 10:18

2 Answers2

1

GO is a client batch seperator command. You can replace it with ;. It should not be sent in your EXEC dynamic SQL.

USE master
GO --<----- client actually send the first batch to SQL and wait for a response
SELECT * from sys.databases
GO

Should be translated in

Application.Exec("USE master");
Application.Exec("SELECT * from sys.databases");

or you can write it this way:

Application.Exec("'USE master;SELECT * from sys.databases")

More about GO http://msdn.microsoft.com/en-us/library/ms188037(v=sql.90).aspx

PollusB
  • 1,726
  • 2
  • 22
  • 31
  • Thanks but that doesn't give me any clues how I might reliably perform the split in Java. I think you are right about EXEC but I haven't checked. – Tim Abell May 29 '12 at 08:02
1

Ok, so this isn't going to be exactly what you want, but you might find it a start. I released SchemaEngine (which forms the core of most of my products) as open source here. In there, you will find C# code that does what you want very reliably (i.e. not tripping up with strings, comments etc etc). It also support the 'GO x' syntax to repeat a batch x times.

If you download that and have a look in /Atlantis.SchemaEngine/Helpers you'll find a class called BatchParser.cs which contains a method called ParseBatches - which does pretty much what it says on the tin.

Matt Whitfield
  • 6,436
  • 3
  • 29
  • 44