
I have some text like

The quick brown [fox] jumps over the lazy [dog]

If I use the regex

\[(.*?)\]

I get matches as

fox
dog

I am looking for a regex that works even when one of the brackets is missing.

For example, if I have text like this

The quick brown [fox jumps over the lazy [dog]

I want the matches to return "dog"

Update: Another example, if I have text like this

The quick brown [fox] jumps over the lazy dog]

I want the matches to return "fox"

The text can contain multiple matches, and more than one bracket can be missing too :(.

I can also use C# to take substrings of the results I get from the regex matches.

3 Answers


Try this one: \[[^[]*?\]

It will skip any candidate match that contains a [ character.
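
This pattern matches the brackets together with the text between them, so reading only the inner text needs a capturing group. The following is a minimal C# sketch under that assumption: the parentheses are an addition to the pattern as posted, and the names BracketDemo and InnerTexts are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BracketDemo
{
    // Assumption: parentheses were added around the character class so that
    // Groups[1] holds the text between the brackets.
    private const string Pattern = @"\[([^[]*?)\]";

    // Collects the Group 1 value of every match in the input.
    public static List<string> InnerTexts(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        var input = "The quick brown [fox jumps over the lazy [dog]";
        // The unclosed "[fox" is rejected because the class cannot cross
        // the second "[", so only "dog" is printed.
        Console.WriteLine(string.Join(", ", InnerTexts(input)));
    }
}
```

For the sample input with the missing closing bracket, this prints dog.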

MaKCbIMKo
  • This is less efficient than my suggestion as lazy matching is costlier than greedy one with an appropriate character class. Also, [it does not capture `fox`, it matches `[fox]`](http://regexstorm.net/tester?p=%5c%5b%5b%5e%5b%5d*%3f%5c%5d&i=The+quick+brown+%5bfox%5d+jumps+over+the+lazy+dog%5d%0d%0aThe+quick+brown+%5bfox%5d+jumps+over+the+lazy+%5bdog%5d%0d%0aThe+quick+brown+%5bfox+jumps+over+the+lazy+%5bdog%5d). – Wiktor Stribiżew Apr 28 '16 at 16:54
  • I just mean that it is not the most efficient pattern for the current task. I feel you are not sure how lazy matching works. See [this answer of mine](http://stackoverflow.com/questions/36770799/perl-regex-matching-optional-phrase-in-longer-sentence/36787675#36787675) on how lazy and greedy quantified patterns work. – Wiktor Stribiżew Apr 28 '16 at 17:05
  • Thanks for the link! Really useful! One question: is lazy always 'faster' than greedy? Or could all these lazy checks end up taking more time? – MaKCbIMKo Apr 28 '16 at 17:14
  • Sometimes lazy, sometimes greedy, sometimes unrolled patterns are faster. A lot depends on the content and on the quantified pattern. In your case, each character that is not `[` between `[` and `]` is skipped first to try to match `]`, and if that fails, the lazy pattern is expanded by one character. In my pattern, `[^][]*` matches 0+ characters other than `[` and `]` *at once* up to the first `]`. Just go to regex101.com, test my regex and yours, pay attention to the number of steps required to match the string, and see the regex debugger page. – Wiktor Stribiżew Apr 28 '16 at 17:19

Here you go: \[[^\[]+?\]

The negated character class simply prevents [ from being matched between the brackets.

Laurel

If you plan to match anything but [ and ] between the closest [ and ] while capturing what is inside, use

\[([^][]*)]

Pattern details

  • \[ - a literal [
  • ([^][]*) - Group 1 capturing 0+ characters other than [ and ] (as [^...] is a negated character class and it matches all characters other than those defined inside the class) (this Group 1 value is accessed via Regex.Match(INPUT_STRING, REGEX_PATTERN).Groups[1].Value)
  • ] - a literal ] (it does not have to be escaped outside a character class)

See the regex demo, and here is a C# demo:

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var list = new List<string>() {"The quick brown [fox] jumps over the lazy dog]",
        "The quick brown [fox] jumps over the lazy [dog]",
        "The quick brown [fox jumps over the lazy [dog]"};
list.ForEach(m =>
{
    // Run the regex once per string; Regex.Match returns only the first match
    var match = Regex.Match(m, @"\[([^][]*)]");
    Console.WriteLine("\nMatch: " + match.Value +           // Print the Match.Value
                      "\nGroup 1: " + match.Groups[1].Value); // Print the Capture Group 1 value
});

Results:

Match: [fox]
Group 1: fox

Match: [fox]
Group 1: fox

Match: [dog]
Group 1: dog
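
Since the question mentions that the text can contain several bracketed items, here is a hedged sketch that collects every Group 1 value with Regex.Matches instead of only the first match. The names BracketExtractor and Extract, and the sample input, are hypothetical and chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BracketExtractor
{
    // Returns the Group 1 value of every match of \[([^][]*)] in the input.
    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, @"\[([^][]*)]"))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Hypothetical input with one unmatched "[" and one unmatched "]".
        var input = "The [quick brown [fox] jumps] over the lazy [dog]";
        Console.WriteLine(string.Join(", ", Extract(input)));
    }
}
```

With this input the sketch prints fox, dog: the stray [ before "quick" and the stray ] after "jumps" are both skipped, because the character class cannot cross either kind of bracket.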
Wiktor Stribiżew