3

We are trying to replace all matching patterns (regex) in a string builder with their respective "groups".

Firstly, we are trying to find the count of all occurrences of that pattern and loop through them (count - termination condition). For each match we are assigning the match object and replace them using their respective groups.

Here only the first occurrence is replaced and the other matches are never replaced.

      *str* - contains the actual string

      Regex - ('.*')\s*=\s*(.*)

To match pattern:

    'nam_cd'=isnull(rtrim(x.nam_cd),''), 
    'Company'=isnull(rtrim(a.co_name),'')

Pattern : created using https://regex101.com/

*matches.Count* - gives the correct count (here 2)


String pattern = @"('.*')\s*=\s*(.*)";
MatchCollection matches = Regex.Matches(str, pattern);
StringBuilder sb = new StringBuilder(str);
Match match = Regex.Match(str, pattern);

for (int i = 0; i < matches.Count; i++)
{
    String First = String.Empty;
    Console.WriteLine(match.Groups[0].Value);
    Console.WriteLine(match.Groups[1].Value);

    First = match.Groups[2].Value.TrimEnd('\r');
    First = First.Trim();
    First = First.TrimEnd(',');

    Console.WriteLine(First);

    sb.Replace(match.Groups[0].Value, First + " as " + match.Groups[1].Value) + " ,", match.Index, match.Groups[0].Value.Length);
    match = match.NextMatch();
}

Current output:

SELECT DISTINCT
         isnull(rtrim(f.fleet),'') as 'Fleet' ,
        'cust_clnt_id' = isnull(rtrim(x.cust_clnt_id),'')

Expected output:

SELECT DISTINCT
 isnull(rtrim(f.fleet),'') as 'Fleet' ,
 isnull(rtrim(x.cust_clnt_id),'') as 'cust_clnt_id'
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
SaiKanth_K
  • 163
  • 1
  • 12
  • What is the *initial* text, please? – Dmitry Bychenko Jul 02 '19 at 07:06
  • @DmitryBychenko `SELECT DISTINCT 'Fleet'=isnull(rtrim(f.fleet),''), 'cust_clnt_id' = isnull(rtrim(x.cust_clnt_id),'')` that is the original string. – SaiKanth_K Jul 02 '19 at 07:10
  • 4
    A regex solution like this is too fragile. If you need to parse any arbitrary SQL, you need a dedicated parser. You may try `Regex.Replace(s, @"('[^']+')\s*=\s*(\w+\((?>[^()]+|(?\()|(?<-o>\)))*\))", "\n $2 as $1")` ([**demo**](http://regexstorm.net/tester?p=%28%27%5b%5e%27%5d%2b%27%29%5cs*%3d%5cs*%28%5cw%2b%5c%28%28%3f%3e%5b%5e%28%29%5d%2b%7c%28%3f%3co%3e%5c%28%29%7c%28%3f%3c-o%3e%5c%29%29%29*%5c%29%29&i=+%27Fleet%27%3disnull%28rtrim%28f.fleet%29%2c%27%27%29%2c+%27cust_clnt_id%27+%3d+isnull%28rtrim%28x.cust_clnt_id%29%2c%27%27%29&r=%0d%0a+++%242+as+%241)), but it still may fail. – Wiktor Stribiżew Jul 02 '19 at 07:27
  • @WiktorStribizew : That expression worked for me ... thank you. – SaiKanth_K Jul 02 '19 at 11:08

1 Answers1

0

A regex solution like this is too fragile. If you need to parse any arbitrary SQL, you need a dedicated parser. There are examples on how to parse SQL properly in Parsing SQL code in C#.

If you are sure there are no "wild", unbalaned ( and ) in your input, you may use a regex as a workaround, for a one-off job:

var result = Regex.Replace(s, @"('[^']+')\s*=\s*(\w+\((?>[^()]+|(?<o>\()|(?<-o>\)))*\))", "\n $2 as $1");

See the regex demo.

Details

  • ('[^']+') - Capturing group 1 ($1): ', 1 or more chars other than ' and then '
  • \s*=\s* - = enclosed with 0+ whitespaces
  • (\w+\((?>[^()]+|(?<o>\()|(?<-o>\)))*\)) - Capturing group 2 ($2):
    • \w+ - 1+ word chars
    • \((?>[^()]+|(?<o>\()|(?<-o>\)))*\) - a (...) substring with any amount of balanced (...)s inside (see my explanation of this pattern).
Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563
  • Is there any generic regex where we can also process expressions like `'City'=a.city_name , 'State'=a.st_cd, ` Where the table Alias is optional. – SaiKanth_K Jul 02 '19 at 12:34
  • @SaiKanth_K Maybe `Regex.Replace(s, @"('[^']+')\s*=\s*(\w+(?:\.\w+)*)", "$2 as $1")`. See [this demo](http://regexstorm.net/tester?p=%28%27%5b%5e%27%5d%2b%27%29%5cs*%3d%5cs*%28%5cw%2b%28%3f%3a%5c.%5cw%2b%29*%29&i=%27City%27%3da.city_name%2c%0d%0a%27State%27%3da.st_cd%2c&r=%242+as+%241). – Wiktor Stribiżew Jul 02 '19 at 12:45
  • Stribizew : The first expression doesn't serve of input like `'City'=a.city_name` The second expression performs incorrect operations on the first kind of input Below are examples when using second expression ( `@"('[^']+')\s*=\s*(\w+(?:\.\w+)*)" `) Input : `'Address2'=a.addr_2_desc,` `'Address3'=isnull(rtrim(a.addr_3_desc),''),` Output : `a.addr_2_desc as 'Address2',` `isnull as 'Address3'(rtrim(a.addr_3_desc),''),` – SaiKanth_K Jul 03 '19 at 09:23
  • 1
    @SaiKanth_K You should have mentioned you wanted to use one regex for both types of input. Use `('[^']+')\s*=\s*(\w+(?:\.\w+)*(?:\((?>[^()]+|(?\()|(?<-o>\)))*\))?)`, see [demo](http://regexstorm.net/tester?p=%28%27%5b%5e%27%5d%2b%27%29%5cs*%3d%5cs*%28%5cw%2b%28%3f%3a%5c.%5cw%2b%29*%28%3f%3a%5c%28%28%3f%3e%5b%5e%28%29%5d%2b%7c%28%3f%3co%3e%5c%28%29%7c%28%3f%3c-o%3e%5c%29%29%29*%5c%29%29%3f%29&i=SELECT+%27City%27%3da.city_name%2c+%27Fleet%27%3disnull%28rtrim%28f.fleet%29%2c%27%27%29%2c+%27cust_clnt_id%27+%3d+isnull%28rtrim%28x.cust_clnt_id%29%2c%27%27%29&r=%0d%0a+++%242+as+%241). – Wiktor Stribiżew Jul 03 '19 at 10:01
  • Stribizew : Apologies for the miscommunication .... Thank you for the solution that's exactly what i needed. – SaiKanth_K Jul 03 '19 at 10:19