I am new to writing regular expressions and I have the following scenario.

I have a string like this:

  string line = "if (true){var data = string.Format(\"something {0} {1}.\", \"is\", \"wrong\");}";

Now I need to write a regular expression that matches only the closing curly braces that are not inside double quotes.

So far I have tried this:

    "(^(\"[^\"]*\")(}))+"
  • ^(\"[^\"]*\") : I want to ignore any substring that is inside double quotes, and
  • (}) : I want to match }
  • + : for at least one occurrence.

But it seems I did something wrong. Could anyone please help me work out where I went wrong?

Thank you.

Ahsan Ahmad
  • What if `line` is something like `if(true /*{*/) { /*}*/ DoSomething() /*... more stuff*/ }`? Unless you're _very_, _very_ sure that your input is fixed, Regex is not the right tool for this kind of work. – xxbbcc Sep 01 '15 at 19:45
  • I need to find all the closing curly braces ( } ) that are not enclosed in double quotes, regardless of their position – Ahsan Ahmad Sep 01 '15 at 19:47
  • Find and do what? It is important. Split? Or replace with something? – Wiktor Stribiżew Sep 01 '15 at 19:48
  • So what about `if(true /*"{*/ ) { var s="{" + /* "} */ + ";"; { DoSomething() /* } */ }` (notice there is no closing `}` on this line; it may be on the next line) – xxbbcc Sep 01 '15 at 19:49
  • I'm pretty sure Regex is not an appropriate technology for this kind of parsing. You'll probably need to write your own parser. – StriplingWarrior Sep 01 '15 at 19:50
  • I'd strongly recommend reading the popular regex question about matching pairs of tags with regex: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags. Make sure to read the first 10-15 answers, as some have useful guidance. – Alexei Levenkov Sep 01 '15 at 19:50
  • This question is not about matching pairs. The string might look like " \"e e\" e", and I need to find the "e" that is not in double quotes – Ahsan Ahmad Sep 01 '15 at 19:57
  • @cuteteddy You should update your question with what you are actually trying to do, then. A C-like language cannot be reliably parsed with a regular expression, unless you're looking for an extremely narrow case (and even then, it's possible that it can break down). It's usually much easier to write a parser (unless there's already one for the problem you're trying to solve). – xxbbcc Sep 01 '15 at 20:01
  • @cuteteddy you should clearly specify what you are looking for. The sample string you've posted looks like C# code, which is clearly not a good example. Valid C# code can have unbalanced quotes, so a good, clear sample would clarify your post. (The same applies to showing HTML as a sample: RegEx is generally the wrong tool for parsing that too.) – Alexei Levenkov Sep 01 '15 at 20:02
  • @cuteteddy You need to define what it means for `e` to be not in double quotes. Is this in double quotes: `/* "e" */ var a = e;` ? – xxbbcc Sep 01 '15 at 20:03
  • @cuteteddy Ok, then what about this: `var a = '\"' + e + '\"';` If you're trying to parse a source file of a C-like language, it's not possible with regex. If you're trying to do something else, update your question because at this time it's unclear then. – xxbbcc Sep 01 '15 at 20:07
  • @cuteteddy: Just use `}(?=(?:[^"]|"[^"]*")*$)` then (or `}(?=(?:[^"]|"[^"]*(?:\\.[^"]*)*")*$)`). Does it work? (It should not, but you want to match `}` outside of quotes, then who knows...) – Wiktor Stribiżew Sep 01 '15 at 20:08
  • yes it works. thanks a lot – Ahsan Ahmad Sep 01 '15 at 20:11
  • The first one or the second one? Depending on the answer, I will post an answer, or close the question as duplicate. – Wiktor Stribiżew Sep 01 '15 at 20:13
  • the first one works for me – Ahsan Ahmad Sep 01 '15 at 20:18
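
For readers landing here, the first pattern from the comments can be tried out roughly as follows. This is only a sketch: `BraceFinder` and `FindClosingBraces` are hypothetical names, and it assumes the input has balanced double quotes with no escaped quotes inside the literals and no comments, which are exactly the cases the commenters warn about.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class BraceFinder
{
    // Pattern taken from the comment thread: a '}' is accepted only if the rest
    // of the input consists of non-quote characters and complete "..." pairs,
    // i.e. an even number of quotes follows it.
    // Assumption: quotes are balanced and never escaped inside a literal.
    static readonly Regex OutsideQuotes = new Regex(@"}(?=(?:[^""]|""[^""]*"")*$)");

    // Hypothetical helper: index of every '}' that is outside double quotes.
    public static int[] FindClosingBraces(string input) =>
        OutsideQuotes.Matches(input).Cast<Match>().Select(m => m.Index).ToArray();

    static void Main()
    {
        string line = "if (true){var data = string.Format(\"something {0} {1}.\", \"is\", \"wrong\");}";
        // Prints the positions of the braces found outside quotes.
        Console.WriteLine(string.Join(", ", FindClosingBraces(line)));
    }
}
```

With the sample line from the question, the braces in `{0}` and `{1}` are each followed by an odd number of quotes, so the lookahead rejects them and only the final `}` is reported. The lookahead rescans the rest of the string for every candidate brace, so this is better suited to short inputs.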

1 Answer

You just need these parts of your regex:

(?:\"[^\"]*\")|(})

Regex live here.
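
One way this pattern might be used is sketched below. It matches quoted strings as well as braces, so the caller has to keep only the matches where capture group 1 took part. `CaptureGroupDemo` and `BraceIndexes` are hypothetical names, and the sketch assumes there are no escaped quotes or comments in the input.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class CaptureGroupDemo
{
    // First alternative: consume a whole "..." run without capturing it.
    // Second alternative: capture a '}' in group 1.
    // Assumption: no escaped quotes inside the string literals.
    static readonly Regex StringOrBrace = new Regex(@"(?:""[^""]*"")|(})");

    // Hypothetical helper: keep only the matches where group 1 participated,
    // which are the braces that were not swallowed by a quoted string.
    public static List<int> BraceIndexes(string input)
    {
        var result = new List<int>();
        foreach (Match m in StringOrBrace.Matches(input))
        {
            if (m.Groups[1].Success)
                result.Add(m.Index);
        }
        return result;
    }

    static void Main()
    {
        string line = "if (true){var data = string.Format(\"something {0} {1}.\", \"is\", \"wrong\");}";
        foreach (int index in BraceIndexes(line))
            Console.WriteLine(index);
    }
}
```

Because the regex engine scans left to right, each quoted string is consumed as a single match before any brace inside it can be considered. Without the `Groups[1].Success` check, the matches would include the strings themselves.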

  • Your regex appears to be picking out the strings, not the curly braces that the OP asked for. – StriplingWarrior Sep 01 '15 at 19:47
  • Let's say it straight: it won't do the job correctly. If I do not downvote, someone else might. Please fix. – Wiktor Stribiżew Sep 01 '15 at 19:49
  • Unlikely to work correctly, as the OP seems to want to find braces only in actual code (excluding strings with all possible escapes, and possibly comments, single-line or multiline, as not specified otherwise). Expecting some ridiculous construct with `(?<..` to match strings and comments. – Alexei Levenkov Sep 01 '15 at 19:56