0

I need help. I want to extract multiple parts of sql statement string. ex. I have the example of string:

UPDATE table SET **XXX** = '1', **YYY** = 2 WHERE ID = 24125; 

So, I want to extract these values in bold ('xxx' and 'yyy'. In general, names of columns changed via Update statement.

Here is the example using the substring function, for selecting only one part, but in my case I need multiple parts:

statement like '%UPDATE%' then SUBSTRING(statement,NULLIF(CHARINDEX('SET',statement),0)+LEN('SET')+1, NULLIF(CHARINDEX('=',statement),0) -(NULLIF(CHARINDEX('SET',statement),0)+LEN('SET')+1))

Thank you!

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
asterix55
  • 93
  • 1
  • 2
  • 8
  • At least I need to extract the column names, whether in the UPDATE statement there is one or more. I tried with T-SQL and substring function, but I can not take multiple parts. Here is my example for one part: statement like '%UPDATE%' then SUBSTRING(statement,NULLIF(CHARINDEX('SET',statement),0)+LEN('SET')+1, NULLIF(CHARINDEX('=',statement),0) -(NULLIF(CHARINDEX('SET',statement),0)+LEN('SET')+1)) – asterix55 May 02 '11 at 21:26
  • 1
    It won't be easy to do this in an at all robust way in TSQL. Why do you have the requirement? (There might be an alternative way to do whatever it is you are trying to do). And where does this statement come from in the first place? – Martin Smith May 02 '11 at 21:41
  • it's usual UPDATE queries done by users. Therefore I need to take the columns that have been updated. In this case, I will need to extract the values between "SET" and "=", and between "," and "=". – asterix55 May 02 '11 at 21:57
  • @asterix55 - You can access `COLUMNS_UPDATED()` inside an `UPDATE` trigger. But where are you getting the `UPDATE` statements themselves from? Are these historic queries? If so using `C#` or similar will likely be much more productive than trying to do it in TSQL. For example see http://stackoverflow.com/questions/5792507/how-can-i-determine-the-parameters-required-by-an-arbitrary-piece-of-t-sql/5793088#5793088 – Martin Smith May 02 '11 at 22:01
  • I have to use T-SQL, since I am saving these values in specific format and then in .txt file. – asterix55 May 02 '11 at 22:10
  • You still haven't answered where you are getting the raw `UPDATE` statements from? – Martin Smith May 02 '11 at 22:12
  • it's from sql audit logs – asterix55 May 02 '11 at 22:16
  • And how robust does it need to be? Bear in mind `UPDATE bar SET foo='UPDATE baz SET x=6'` is perfectly valid TSQL. – Martin Smith May 02 '11 at 22:25

1 Answers1

1

To do the job reliably, you are going to have to write a decent portion of the parser for SQL (or TSQL). And that's a non-trivial exercise!

You have not taken into account all the legal variations of UPDATE statements. For example, you might get:

UPDATE Sometable
   SET (Col1, Col2, Col3) = ((SELECT Value1, Value2, Value3 FROM ... WHERE ...)),
       Col4 = (SELECT Value4 FROM ... WHERE ...)
 WHERE ...;

And that's before you take into account operations like join updates.

Any simple-minded solution that doesn't handle such queries has the potential to run foul of power users, or hackers who realize what you do parse and want to bypass your detection code. Don't forget that I could put a comment - or even several comments - between any of the tokens in the UPDATE statement. These might or might not make it into the audit log - but then there are probably hints which look like comments, and so on.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278