3

I need to parse SQL "CREATE TABLE" statements, to get the field list and their properties.

It is mostly MySQL syntax, which should be (with incontrolable nested brackets like this example :

CREATE TABLE 'tablename' (
field1 INT,
field2 VARCHAR(10),
field3 FLOAT NOT NULL DEFAULT 0,
PRIMARY KEY (field1, field2),
(CONSTRAINT fk FOREIGN KEY whatever)
) options;

I can easily isolate the column definitions but then I'd like to split the definitions. Splitting on commas is not possible as these can be in nested brackets.

I could iterate though the string to count opening and closing string delimiters (quotes etc.), count opening and closing brackets, so I can detect "first level" commas and split on them.

But I wonder if there could not be more effective ways, like for example using regular expressions, but I'm not good enough at them to determine if some regex could match such complex syntax ...

Neil Knight
  • 47,437
  • 25
  • 129
  • 188
Sierramike
  • 371
  • 4
  • 16

2 Answers2

0

maybe its will be usefull. You will use it like a sql parser. And this is for regular expression

Community
  • 1
  • 1
Victor Tomaili
  • 641
  • 7
  • 10
  • Same as above, so I decided to go on my own, writing a new "IndexOf" function that finds separators as commas taking care about if they are in nested parenthesis or inside a string (specifying a string delimiter which can be double quote, or the famous MySQL backtick `). Maybe it could have performance issues, but it's for an occasionnal parser, so no matter ... Anyways, thanks for the regular expression link, very useful !!! – Sierramike Mar 27 '14 at 15:21
  • İts about small string parse. İf u dont do it for 10000 Times u will dont feel performance issue. Good luck – Victor Tomaili Mar 27 '14 at 15:42
  • I was more searching for a good practice and more maintainable code rather than parformance anyways, but at the end it works so I'll go for it, thanks for your kind comment Victor. – Sierramike Apr 03 '14 at 07:47
0

What you're looking for is a tokenizer. Microsoft actually includes one as part of visual studio for SQL (Microsoft.Data.Schema.Dom). I'm not certain if this is SQL server only, but even that may be ok depending on what you're trying to parse.

Further, this link directs you to some alternative ways to parse SQL.

Community
  • 1
  • 1
Giovanni Galbo
  • 12,963
  • 13
  • 59
  • 78
  • Many of the links are related to Select statements, or only to SQL Server specific syntax, in my case it's MySQL Create Tables essentialy, so it may not work very well ... – Sierramike Mar 27 '14 at 15:19
  • The link was to an example tokenizer, one that I'm aware of. You'll need to find one for MySQL. Here is a commercial product that supports MySQL: http://www.sqlparser.com/index.php – Giovanni Galbo Mar 28 '14 at 13:40