4

Response to: Regular Expression to find a string included between two characters while EXCLUDING the delimiters

Hi, I'm looking for a regex pattern that applies to my string, which contains bracketed groups:

[1,2,3,4,5] [abc,ef,g] [0,2,4b,y7]
The contents could be anything: word, digit, and non-word characters, together or separated.

I can get the group between brackets with \[(.*?)\], but what regex pattern will give me both the group between brackets and the sub-group strings separated by commas, so that the result looks like the following?

Group1 : 1,2,3,4,5
 Group1: 1
 Group2: 2
 Group3: 3
 Group4: 4
 Group5: 5

Group2 : abc,ef,g
 Group1: abc
 Group2: ef
 Group3: g

etc.

Thank you for your help

Myra
  • 1
    Is there a particular reason you *must* use RegEx? – Amber Apr 16 '10 at 09:49
  • The question may seem easy at first glance, because the obvious answer is to use string operations. I'm looking for another way to solve my problem with regex, if that's possible. The particular reason I must use RegEx is that I want to understand RegEx more deeply, rather than rely on string methods. – Myra Apr 16 '10 at 12:52

5 Answers

6

I agree with @Dav that you would be best using String.Split on each square-bracketed group.

However, you can extract all the data using a single regular expression:

(?:\s*\[((.*?)(?:,(.+?))*)\])+

Using this expression, you will have to process all the captures of each group to get all the data. As an example, run the following code on your string:

// Requires: using System; using System.Text.RegularExpressions;
var regex = new Regex(@"(?:\s*\[((.*?)(?:,(.+?))*)\])+");
var match = regex.Match(@"[1,2,3,4,5] [abc,ef,g] [0,2,4b,y7]");

// Start at 1: group 0 is the entire match.
for (var i = 1; i < match.Groups.Count; i++)
{
    var group = match.Groups[i];
    Console.WriteLine("Group " + i);

    // A group keeps every capture it made while the outer (...)+ repeated.
    for (var j = 0; j < group.Captures.Count; j++)
    {
        var capture = group.Captures[j];

        Console.WriteLine("  Capture " + j + ": " + capture.Value
                                       + " at " + capture.Index);
    }
}

This produces the following output:

Group 1
  Capture 0: 1,2,3,4,5 at 1
  Capture 1: abc,ef,g at 13
  Capture 2: 0,2,4b,y7 at 24
Group 2
  Capture 0: 1 at 1
  Capture 1: abc at 13
  Capture 2: 0 at 24
Group 3
  Capture 0: 2 at 3
  Capture 1: 3 at 5
  Capture 2: 4 at 7
  Capture 3: 5 at 9
  Capture 4: ef at 17
  Capture 5: g at 20
  Capture 6: 2 at 26
  Capture 7: 4b at 28
  Capture 8: y7 at 31

Group 1 gives you the value of each square-bracketed group, group 2 gives you the first item matched in each square-bracketed group and group 3 gives you all the subsequent items. You will have to look at the indexes of the captures to determine which item belongs to each square-bracketed group.
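As a rough sketch of that last step, the item captures can be assigned to their bracketed group by checking whether each one falls inside the span of a group 1 capture. The helper name GroupItemsByBracket is made up for this illustration, and the code assumes a C# version that allows top-level statements and LINQ:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the items of each bracketed group, in order.
static List<List<string>> GroupItemsByBracket(string input)
{
    var regex = new Regex(@"(?:\s*\[((.*?)(?:,(.+?))*)\])+");
    var match = regex.Match(input);
    var result = new List<List<string>>();
    if (!match.Success) return result;

    // Merge first items (group 2) and subsequent items (group 3),
    // then sort them by their position in the input.
    var items = match.Groups[2].Captures.Cast<Capture>()
        .Concat(match.Groups[3].Captures.Cast<Capture>())
        .OrderBy(c => c.Index)
        .ToList();

    // Each group 1 capture spans one bracket pair's contents;
    // an item belongs to it if it lies entirely inside that span.
    foreach (Capture outer in match.Groups[1].Captures)
    {
        int start = outer.Index;
        int end = outer.Index + outer.Length;
        result.Add(items
            .Where(c => c.Index >= start && c.Index + c.Length <= end)
            .Select(c => c.Value)
            .ToList());
    }
    return result;
}

var groups = GroupItemsByBracket("[1,2,3,4,5] [abc,ef,g] [0,2,4b,y7]");
for (var i = 0; i < groups.Count; i++)
{
    Console.WriteLine("Group " + (i + 1) + ": " + string.Join(" | ", groups[i]));
}
```

For the sample string this should yield three lists (1 2 3 4 5, then abc ef g, then 0 2 4b y7). Comparing spans rather than counting captures avoids having to know in advance how many items each bracket pair holds.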

Phil Ross
  • Capturing groups and merging the results. That's one way, and it looks like the only way. I accept your response. Thank you. – Myra Apr 16 '10 at 12:54
3

Here's another option that uses CaptureCollections (the only way to do this in a single regex). Where Phil Ross's answer does it all in one match operation, this one does multiple matches. This way, all the individual-item captures are properly grouped according to the bracket pairs where they were found.

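// Requires: using System; using System.Text.RegularExpressions;
string s = @"[1,2,3,4,5] [abc,ef,g] [0,2,4b,y7] ";
// Group 1 is the whole bracket content; group 2 is re-captured once per item.
Regex r = new Regex(@"\[((?:([^,\[\]]+),?)*)\]");
int matchNum = 0;
// One match per bracket pair, so the item captures stay with their group.
foreach (Match m in r.Matches(s))
{
  Console.WriteLine("Match {0}, Group 1: {1}", ++matchNum, m.Groups[1]);
  int captureNum = 0;
  foreach (Capture c in m.Groups[2].Captures)
  {
    Console.WriteLine("  Group 2, Capture {0}: {1}", ++captureNum, c);
  }
}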

Output:

Match 1, Group 1: 1,2,3,4,5
  Group 2, Capture 1: 1
  Group 2, Capture 2: 2
  Group 2, Capture 3: 3
  Group 2, Capture 4: 4
  Group 2, Capture 5: 5
Match 2, Group 1: abc,ef,g
  Group 2, Capture 1: abc
  Group 2, Capture 2: ef
  Group 2, Capture 3: g
Match 3, Group 1: 0,2,4b,y7
  Group 2, Capture 1: 0
  Group 2, Capture 2: 2
  Group 2, Capture 3: 4b
  Group 2, Capture 4: y7
Alan Moore
2

You'd be better off using String.Split on your groups to split them once you have the bracket-delimited groups.
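A minimal sketch of that approach, assuming the \[(.*?)\] pattern from the question and the sample string given there (the variable names are only illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var input = "[1,2,3,4,5] [abc,ef,g] [0,2,4b,y7]";
var split = new List<string[]>();

// The regex only finds the bracket-delimited groups; Split does the rest.
foreach (Match m in Regex.Matches(input, @"\[(.*?)\]"))
{
    string inner = m.Groups[1].Value;   // text between the brackets
    string[] parts = inner.Split(',');  // comma-separated items
    split.Add(parts);
    Console.WriteLine(inner + " -> " + string.Join(" | ", parts));
}
```

This keeps the pattern trivial and leaves the variable number of items to String.Split, at the cost of not being a pure regex solution.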

Amber
1

\[(.*?)\] will tell you what is between the brackets, but if you add a group name:

\[(?<NumSequence>.*?)\]

This assigns the capture to a named group, NumSequence, which you can then reference by name.

EDIT: I would then use Phil's regex, as mine above only shows how to name a group.
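For illustration, here is a small sketch of referencing a named group in .NET, assuming the question's sample string. It only retrieves the text between each pair of brackets and does not split the items:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var regex = new Regex(@"\[(?<NumSequence>.*?)\]");
var sequences = new List<string>();

foreach (Match m in regex.Matches("[1,2,3,4,5] [abc,ef,g] [0,2,4b,y7]"))
{
    // Look the group up by name instead of by number.
    sequences.Add(m.Groups["NumSequence"].Value);
}

Console.WriteLine(string.Join(" ; ", sequences));
```

Naming the group does not change what is matched; it only makes the lookup independent of the group's position in the pattern.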

Neil Knight
0

I do not think that what you ask is possible in a single regex. Your data seems to have a variable number of comma-separated entries between the brackets, and no regular expression has a variable number of capturing groups.

Jens