I have strings like the examples below, and I need to extract the matches from each of them, with a single regex if possible.

  1. [ "Italy" ] * 5 - [ "France","Paris" ] + 2 -> Match 1: [ "Italy" ], Match 2: [ "France","Paris" ]
  2. 3*12 * ["Country"] + ["City".] * 2 + [""] -> Match 1: ["Country"], Match 2: [""]
  3. [Madrid] -> No match
  4. ["Spain12"] * ["Name "Industry""] -> Match 1: ["Spain12"], Match 2: ["Name "Industry""]
  5. "My issue" / ["Error] ["some!+ name"]"] * 3 + 4 -> Match 1: ["Error] ["some!+ name"]"]

I tried this:

\[\s*\"(.*?)\"\s*\]

But in cases 2 and 5 it gives me the wrong result: ["City".] * 2 + [""] and ["Error ]["some name"], whereas I need ["Error] ["some name"]"].
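The failure can be reproduced with a short sketch. It uses Python's `re` module rather than .NET, on the assumption that this particular pattern behaves the same way in both engines for these inputs; the variable names are illustrative only. The lazy `.*?` stops at the first `"` followed by `]`, so it cannot tell a nested closing delimiter from the real one, and it happily runs across unrelated text to find one.

```python
import re

# The first attempted pattern: lazy match between [" and "].
lazy = re.compile(r'\[\s*"(.*?)"\s*\]')

case2 = '3*12 * ["Country"] + ["City".] * 2 + [""]'
case5 = '"My issue" / ["Error] ["some!+ name"]"] * 3 + 4'

# Collect the full text of every match in each string.
matches2 = [m.group(0) for m in lazy.finditer(case2)]
matches5 = [m.group(0) for m in lazy.finditer(case5)]

print(matches2)  # second match runs from ["City". all the way to [""]
print(matches5)  # match stops one delimiter too early
```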

I also tried this:

\[(?:[^\]\[]+|\[(?:[^)(]+|\([^)(]*\])*\])*\]

But it gives the wrong result in cases 2 and 5 too.

PS: The string in brackets can contain any digits, letters, and other characters.

runia

1 Answer

Here is a solution that uses balancing groups, a feature of .NET Regex that lets you use a stack to make sure the delimiters, in this case [\s*" and "\s*], are balanced. It gets all of your matches correct, and it should scale reasonably well to other inputs.

\[\s*"(?>(?!\[\s*"|"\s*]).|\[\s*"(?<Depth>)|"\s*](?<-Depth>))*(?(Depth)(?!))"\s*]

A breakdown of this regex is as follows:

\[\s*"            match the first (opening) delimiter
(?>               start atomic group (no backtracking into it)
(?!\[\s*"|"\s*]). look ahead to make sure a delimiter does not start here, then consume one character
|\[\s*"(?<Depth>) alternatively, match an opening delimiter and push onto the Depth stack
|"\s*](?<-Depth>) alternatively, match a closing delimiter and pop the Depth stack (fails if the stack is empty)
)*                close atomic group, repeat it 0 or more times
(?(Depth)(?!))    if the Depth stack still has anything on it, fail the match
"\s*]             match the final closing delimiter

The delimiters are easy: \[ is the opening bracket, then \s* is any amount of whitespace, then " is a quotation mark. The opening bracket has to be escaped because it has special meaning in Regex, but the closing bracket used in the closing delimiter does not, because its special meaning is tied to the opening bracket. So the closing delimiter is easier: "\s*], just the reverse of the opening one.

Using this, you get the following matches:

"My issue" / ["Error] ["some!+ name"]"] * 3 + 4  -> ["Error] ["some!+ name"]"]
3*12 * ["Country"] + ["City".] * 2 + [""]        -> ["Country"] , [""]
["Spain12"] * ["Name "Industry""]                -> ["Spain12"] , ["Name "Industry""]
[  "Italy"  ] * 5 - [  "France","Paris" ] + 2    -> [  "Italy"  ] , [  "France","Paris" ]
[Madrid]                                         -> NO MATCHES
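For readers without a .NET engine at hand, the same stack logic can be sketched as a small hand-written scanner. This is not the regex above: Python's `re` module has no balancing groups, so the Depth stack is replaced by an integer counter, and `find_balanced` and `_scan` are hypothetical helper names. It is only an illustration of the idea, under the assumption that inputs are single-line strings like the ones in the question.

```python
import re

OPEN = re.compile(r'\[\s*"')    # opening delimiter: [ then optional whitespace then "
CLOSE = re.compile(r'"\s*\]')   # closing delimiter: " then optional whitespace then ]

def _scan(text, i):
    """Scan from just after an opening delimiter.
    Return the end index of the balancing closing delimiter, or None."""
    depth = 0
    while i < len(text):
        m = CLOSE.match(text, i)
        if m:
            if depth == 0:
                return m.end()   # nothing left to pop: this closes the match
            depth -= 1           # pop one nested level
            i = m.end()
            continue
        m = OPEN.match(text, i)
        if m:
            depth += 1           # push one nested level
            i = m.end()
            continue
        if text[i] == "\n":
            return None          # like '.', do not run across a newline
        i += 1                   # ordinary character, consume it
    return None                  # ran out of text before a closing delimiter

def find_balanced(text):
    """Return every balanced [" ... "] span, scanning left to right."""
    results, pos = [], 0
    while pos < len(text):
        start = OPEN.match(text, pos)
        end = _scan(text, start.end()) if start else None
        if end is None:
            pos += 1             # no match starting here, try the next position
        else:
            results.append(text[pos:end])
            pos = end            # resume after the match
    return results

print(find_balanced('3*12 * ["Country"] + ["City".] * 2 + [""]'))
```

Run on the five strings from the question, this sketch gives the same spans as the list above. The counter plays the role of the Depth stack: a closing delimiter seen while the counter is zero ends the match, which is why `["City".]` never matches on its own.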

EDIT: For more information on Balancing Groups, see this wonderful answer:

What are regular expression Balancing Groups?

Zaelin Goodman