1

I need a regular expression that matches " from" only when that " from" is not within parentheses.

Any " from" inside parentheses should be ignored. Here a=" from"; b="("; and c=")";

The closest (but invalid) pattern I could write is

string pat = @"^((?!\(.* from.*\)).)* from((?!\(.* from.*\)).)*$";

My expression rejects the whole string if any " from" is present inside parentheses, but I want to simply ignore such occurrences of " from" and still match the ones outside.


Matches should be found in:

1: " from"
2: select field1 from t1 (select field1 from t1)
(1 match in each of the two strings above)
3: select field1 from t1 (select field1 from t1)select field1 from t1
(2 matches)

Strings not containing matches (because I want to ignore the " from" within parentheses):

1: select field1 no_f_rom_OutOf_Parenthesis t1 (select field1 from t1)
2: (select field1 from t1)
3: "" (empty string)
4: No word as form
(0 matches in all four strings)




Related material (not essential reading):

The most helpful link I found, explaining how to match 'pattern' but not 'regular', is a reply by stanav dated Jul 31st, 2009, 08:05 AM in the following thread:

http://www.vbforums.com/archive/index.php/t-578417.html

Also: Regex in C# that contains "this" but not "that"

Also: Regular expression to match a line that doesn't contain a word?

I have been studying and searching for about a week, but it is still too complex for me :)

Sami
  • Can the parentheses be nested? – primfaktor Aug 17 '12 at 06:35
  • Yes even they can be nested. But I have to find only that " from" which is not in any parenthesis. If it is found. Expression is matched – Sami Aug 17 '12 at 06:37
  • For SQL, you need a parser. For "hello bro! Whats up" you might be able to use regex. – H H Aug 17 '12 at 06:37
  • I don't know if .NET's regexes can handle nested patterns, Perl's can be “recursive” AFAIK. – primfaktor Aug 17 '12 at 06:40
  • @HenkHolterman. Yes, the expressions are variable – Sami Aug 17 '12 at 06:41
  • give more examples of valid and invalid output.. everything that u wrote may make sense to u but not others..so NEED more examples! – Anirudha Aug 17 '12 at 08:16
  • @Anirudha bro. I have made it shorter but tried to elaborate with examples as you told. Is it making some sense now? – Sami Aug 17 '12 at 10:47
  • @SamiAkram if i got u right,u want queries which are not in parenthesis..right!If that's so check out the answer! – Anirudha Aug 17 '12 at 11:11
  • @Anirudha I can have a string 'bb hj exp ab' and 'hh exp bb hj exp ab' if I say ignore any exp surrounded by 'bb.*exp.*ab' and match others, I would get matched only first 'exp' of string2. Here " from" = "exp" and parenthesis '(',')' = 'bb','ab' if I get the solution 'a' ignoring 'a' in 'b'. It will be some combination of any two expressions and dynamically work for new situations – Sami Aug 17 '12 at 14:19
  • it would be too complicated..try to get out of `Regex` and think what would you do if there were no `regex`..try to break the code into parts.it will help you – Anirudha Aug 17 '12 at 14:24
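
For comparison, here is a minimal sketch of what the last comment suggests: a plain loop with a depth counter, no regex. This is not code from the thread. The class name `FromScanner` and the method `CountAtDepthZero` are hypothetical, and it assumes the search text itself contains no parentheses and that a stray ")" at depth zero should be ignored.

```csharp
using System;

static class FromScanner
{
    // Counts occurrences of `needle` that start while no parenthesis is open.
    // Assumption: `needle` contains neither '(' nor ')'.
    public static int CountAtDepthZero(string input, string needle)
    {
        if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(needle))
            return 0;

        int depth = 0;
        int count = 0;
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '(')
            {
                depth++;
            }
            else if (c == ')')
            {
                // Assumption: an unmatched ')' is ignored instead of going negative.
                if (depth > 0) depth--;
            }
            else if (depth == 0 &&
                     string.CompareOrdinal(input, i, needle, 0, needle.Length) == 0)
            {
                count++;
                i += needle.Length - 1; // continue after the match
            }
        }
        return count;
    }
}
```

Calling `FromScanner.CountAtDepthZero(text, " from")` on the sample strings from the question should give the counts listed there. Because the depth is tracked explicitly, nested parentheses need no special handling.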

2 Answers

2

The following should work, even with arbitrarily nested parentheses:

if (Regex.IsMatch(subjectString, 
    @"\sfrom           # Match ' from'
    (?=                # only if the following regex can be matched here:
     (?:               # The following group, consisting of
      [^()]*           # any number of characters except parentheses,
      \(               # followed by an opening (
      (?>              # Now match...
       [^()]+          #  one or more characters except parentheses
      |                # or
       \( (?<DEPTH>)   #  a (, increasing the depth counter
      |                # or
       \) (?<-DEPTH>)  #  a ), decreasing the depth counter
      )*               # any number of times
      (?(DEPTH)(?!))   # until the depth counter is zero again,
      \)               # then match the closing )
     )*                # Repeat this any number of times.
     [^()]*            # Then match any number of characters except ()
     \z                # until the end of the string.
    )                  # End of lookahead.", 
    RegexOptions.IgnorePatternWhitespace))

As a single line regex ("The horror! The horror!"), if you insist:

if (Regex.IsMatch(subjectString,@"\sfrom(?=(?:[^()]*\((?>[^()]+|\((?<DEPTH>)|\)(?<-DEPTH>))*(?(DEPTH)(?!))\))*[^()]*\z)"))
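
To count the matches instead of only testing for one, the same pattern could be passed to `Regex.Matches`. The sketch below uses the single-line pattern unchanged; the wrapper class `FromOutsideParens` and its `Count` method are hypothetical names added for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class FromOutsideParens
{
    // The single-line pattern from above, unchanged.
    static readonly Regex Pattern = new Regex(
        @"\sfrom(?=(?:[^()]*\((?>[^()]+|\((?<DEPTH>)|\)(?<-DEPTH>))*(?(DEPTH)(?!))\))*[^()]*\z)");

    // Number of ' from' occurrences that are followed only by balanced
    // parenthesis groups up to the end of the string.
    public static int Count(string input)
    {
        return Pattern.Matches(input).Count;
    }
}
```

On the sample strings from the question, `FromOutsideParens.Count` should return 1, 1 and 2 for the three matching strings and 0 for the others. Note that `\sfrom` has no trailing word boundary, so it would also match the start of a longer word such as " fromage"; appending `\b` is one option if that matters.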
Tim Pietzcker
  • ! Please write the expression at the bottom in a single line without comments, as I may encounter an error while rewriting it. – Sami Aug 17 '12 at 11:40
  • What is depth? I had never seen it before. Is it a special word or variable? \z should be replaced by $ in C#? – Sami Aug 17 '12 at 11:47
  • @SamiAkram: It is a variable. `\z` is more specific than `$` (it really only matches at the end of the string, not before any closing whitespace, which is why I prefer it generally), but in this case, you can just as well use `$`. – Tim Pietzcker Aug 17 '12 at 11:52
  • string q = "select hi from t1 in (select hello from t2)"; If(Regex.IsMatch(q, @"\sfrom(?(?:[^()]*\((?>[^()]+|\( (?)|\) (?<-DEPTH>))*(?(DEPTH)(?!))\))*[^()]*\z)", RegexOptions.IgnorePatternWhitespace)) Got Error: Parsing Error. Quantifier {x,y} following nothing – Sami Aug 17 '12 at 13:15
  • @SamiAkram: It looks like you lost quite a few backslashes when copying the regex. Also, the error doesn't make much sense - there is no `{x,y}` quantifier anywhere in this regex. – Tim Pietzcker Aug 17 '12 at 14:32
  • There are 6 "\"s i copied all. I also don't know whats {x,y} quantifier however i had faced it whenever i had syntactical inaccuracy in using '*' – Sami Aug 17 '12 at 14:46
  • Please make an edit/update to the answer and add a tested straight line of code (which could meet the question requirement which you have already understood) with no explanation and keep above as it is. Please :) – Sami Sep 13 '12 at 17:37
  • @Sami: I've just tested it (in Visual C# 2008). It works as is. The condition is met on the string `"foo (bar (baz)) from (spam)"` and is not met on `"foo (bar (baz)) (from (spam))"`. – Tim Pietzcker Sep 13 '12 at 18:02
  • And what about "foo (bar (baz)) from (from (spam))" ? It must meet it as well because there exists " from" which is not in any parenthesis. If it meets then if you don't mind please copy paste your pattern string from visual studio in answer or comment. – Sami Sep 13 '12 at 18:10
  • @Sami: Yes, it works too. I've added the comment-free regex, but please use the verbose one. That one's hard enough to read already... – Tim Pietzcker Sep 13 '12 at 18:11
  • Thanks a lot sir. Its working perfectly. I will prefer the verbose one by making comparison of both. I will use that one because now i have got sure, that's alright. Although I could not get it working, still I was strongly suspecting it the right solution that is why i requested you :) Thanks again.... – Sami Sep 13 '12 at 18:26
  • Just as a remark: this assumes that your parentheses are balanced overall. It will fail to match "from" outside of a bunch of parentheses if some of them aren't balanced. – jwg Mar 08 '13 at 07:55
1

This may be what you want.

string s="select field1 dfd t1 (select field1 from t1)select field1 from t1";
Regex r=new Regex(@"(?<=\)|^)\bselect\b.*?\bfrom\b.*?(?=\()",RegexOptions.RightToLeft);
// Replace returns a new string (strings are immutable), so keep the result.
string result=r.Replace(s,"HELL yeah");
Anirudha
  • Yes Anirudha. You have solved my problem. Thanks for it. I have used something like that. It is solving my current problem. But not the solution to title as I can have ignoring criteria as 'bb hj exp ab' ignore any exp surrounded by 'bb.*exp.*ab' if I get the solution 'a' ignoring 'a' in 'b'. It will be some combination of any two expressions and dynamically work for new situations. Anyhow you solved my problem, better than i had done, so i am up-voting you. Thanks a lot. Any further help is welcome too. – Sami Aug 17 '12 at 11:37
  • This fails with nested parentheses. – Tim Pietzcker Aug 17 '12 at 11:38