Using balancing groups, it is easy to match brackets. For example, the pattern s*\{(?:[^{}]|(?<counter>\{)|(?<-counter>\}))+(?(counter)(?!))\} correctly checks the braces in this example:

{
   {
   "correct";
   }
}

One issue I have is that this pattern doesn't work if there is a string with a brace inside, e.g.

{
    {
    "wrong}";
    }
}

Checking that the quotes are matched isn't difficult, but I fail to see how to integrate that check into the original regex. How would I make it so that the balancing group ignores brackets inside string literals?

MKII
  • I don't think regex is the best tool for data that may be or appear to be nested. You would need to get rid of or capture all the data in quotes so it can't be confused with the other organizational braces. – Daniel Gale Dec 12 '17 at 14:37

1 Answer

Brief

Regex is not the best tool for what you're trying to do; however, that doesn't mean it's impossible.


Code

See regex in use here

s*\{(?:"(?:(?<!\\)\\(?:\\{2})*"|[^"])*"|[^{}]|(?<counter>\{)|(?<-counter>\}))+(?(counter)(?!))\}

Note: I simply prepended "(?:(?<!\\)\\(?:\\{2})*"|[^"])*"| to your pattern, so I'll only explain that portion.

A shorter method (thanks to PhiLho's answer on regex for a quoted string with escaping quotes) is as follows.

See regex in use here

s*\{(?:"(?:[^"\\]|\\.)*"|[^{}]|(?<counter>\{)|(?<-counter>\}))+(?(counter)(?!))\}
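For comparison, here is a minimal sketch of the same idea outside .NET, assuming Python, whose re module has no balancing groups. It reuses the string-literal sub-pattern "(?:[^"\\]|\\.)*" from the pattern above and replaces the counter group with a plain integer. The names TOKEN and braces_balanced are hypothetical, and unlike the regex above it checks a whole text rather than locating one balanced block.

```python
import re

# Same alternation order as the pattern above: try a whole string literal
# first, then a single brace. Any other character is skipped by finditer.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|[{}]')


def braces_balanced(text: str) -> bool:
    """Return True if the braces outside double-quoted strings are balanced."""
    depth = 0  # plays the role of the `counter` balancing group
    for match in TOKEN.finditer(text):
        token = match.group()
        if token == "{":
            depth += 1
        elif token == "}":
            depth -= 1
            if depth < 0:  # a closing brace with nothing open
                return False
        # any other token is a string literal, so its braces are ignored
    return depth == 0


sample = '{\n    {\n    "wrong}";\n    }\n}'
print(braces_balanced(sample))
```

Because the string-literal alternative is tried before the brace alternative, a brace inside quotes is consumed as part of the string and never reaches the counter, which is the same reason the prepended alternative works in the .NET pattern.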

Explanation

I took the same idea from a regex I recently used to answer another question and applied it to yours. It handles escaped double quotes inside string literals in addition to your opening and closing curly braces.

  • " Match this literally
  • (?:(?<!\\)\\(?:\\{2})*"|[^"])* Match either of the following any number of times
    • (?<!\\)\\(?:\\{2})*" Match the following
      • (?<!\\) Negative lookbehind ensuring what precedes is not a literal backslash \
      • \\ Match a literal backslash
      • (?:\\{2})* Match two literal backslashes any number of times, including zero (0, 2, 4, 6, 8, etc.)
      • " Match this literally
    • [^"] Match any character except " literally
  • " Match this literally

Note: (?<!\\)\\(?:\\{2})*" ensures that escaped double quotes are matched properly. In other words, it matches any odd number of backslashes preceding a double quote character ", so \", \\\", \\\\\", etc. are valid escaped double quotes, while in \\", \\\\", \\\\\\" the quote is not escaped (and thus terminates the string).
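The odd/even rule in the note can be checked with a short sketch, assuming Python's re module, which accepts this sub-pattern unchanged because the lookbehind is fixed-width. The names ESCAPED_QUOTE and quote_is_escaped are hypothetical helpers for illustration only.

```python
import re

# The escaped-quote sub-pattern from the explanation above, verbatim.
ESCAPED_QUOTE = re.compile(r'(?<!\\)\\(?:\\{2})*"')


def quote_is_escaped(backslashes: int) -> bool:
    """Build x, then `backslashes` backslashes, then a quote, and test it."""
    sample = "x" + "\\" * backslashes + '"'
    return ESCAPED_QUOTE.search(sample) is not None


# Map each backslash count from 0 to 6 to whether the quote counts as escaped.
results = {n: quote_is_escaped(n) for n in range(7)}
print(results)
```

Only odd counts report the quote as escaped; with an even count (including zero) the sub-pattern finds no match, so the quote would end the string.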

ctwheels
  • I'm not sure the s*\ is needed in the example at the beginning. Does it do anything specific? Maybe it should be \s*? – Daniel Gale Dec 12 '17 at 14:55
  • @DanielGale As I said in my answer, I simply prepended my answer to the beginning of the group the OP had already created. I did not want to edit the existing regex as I'm not sure what the `s*` is supposed to do. I suppose it's a simple typo and that the OP intended to put `\s*`, but I simply left it as I'm not sure what the OP's intentions are. – ctwheels Dec 12 '17 at 14:58
  • @DanielGale yeah, that was a typo I made while creating the MVCE. – MKII Dec 12 '17 at 15:09