1

I have a program that goes through SQL statements, identifying component parts with a space. eg: -

SELECT * FROM tblSales WHERE CustomerID=10 AND Year=2011

Would produce the following as separate components: -

"SELECT","*","FROM","tblSales","WHERE","CustomerID=10","AND" and "Year=2011"

The problem I have however is the use of spaces within values eg.:-

SELECT * FROM tblSales WHERE CustomerNameID='Test Company' AND Year=2011

Using the same space-separating logic, this would produce components of: -

"SELECT,"*","FROM","tblSales","WHERE","CustomerID='Test","Company'","AND" and "Year=2011".

ie the space between "Test" & "Company" causes it to be treated as two separate components.

Question is, Is it possible via RegEx.Replace to replace the spaces between the quotation marks with another character whilst leaving all other spaces in the string in tact?

Shaun.

  • Answer to teh question is YES. :).. See this (stackoverflow.com/questions/7796904/regex-for-matching-literal-strings/7797006) then give it a try and post some examples that you have tried out, i.e. show some effort. – Kashyap Oct 24 '11 at 14:36

2 Answers2

1

Assuming you don't have any escaped quotes, you could simply check if the space you're looking at is followed by an even number of quotes (which means it's outside of a string).

So

splitArray = Regex.Split(subjectString, " (?=(?:[^']*'[^']*')*[^']*$)");

gives you the desired result.

As a verbose regex:

[ ]        # Match a space
(?=        # only if it's followed by...
 (?:       # This subregex:
  [^']*'   #  any number of non-quote characters, followed by a quote
  [^']*'   #  same again (to ensure even number of quotes)
 )*        # zero or more times.
 [^']*     # and any number of non-quote characters until...
 $         # the end of the string.
)          # End of lookahead.
Tim Pietzcker
  • 328,213
  • 58
  • 503
  • 561
  • Seems to me that assuming there aren't escaped quotes is a recipe for disaster... – e.dan Oct 24 '11 at 14:39
  • @e.dan: You're certainly right. Those can be handled, too, but that would make the regex more complicated. – Tim Pietzcker Oct 24 '11 at 14:41
  • I think it borders on impossible, and the OP would be advised to read the famous [HTML parsing with regex](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). SQL is arguably at least as complex as HTML, if not more so. – e.dan Oct 24 '11 at 14:45
  • Given the context by which the SQL commands were written then there won't be any escaped characters and so this will work fine. Thanks – user1011024 Oct 24 '11 at 15:23
1

Possible? Maybe. Maybe not. Certainly not trivial. Parsing SQL with regexes is almost as bad, or quite possibly worse, than parsing HTML with regexes...

You might want to poke around SO and look for other people who have tried to parse SQL...

Community
  • 1
  • 1
e.dan
  • 7,275
  • 1
  • 26
  • 29