
Is there a way to get multiple capture groups out of a regex that is using quantifiers? For example, say I have this data (simplified from what I have to deal with):

<td>Data 1</td>
<td>data 2</td>
<td>data 3</td>
<td>data 4</td>

Right now, if I write a regex like this:

(?:<td>(.+?)<\/td>\s*){4}

I end up with only one capture group, and it holds just the last match, "data 4". Is there a way to use the quantifier and end up with 4 capture groups, or am I forced to write the regex like this to get what I want:

<td>(.+?)<\/td>\s*<td>(.+?)<\/td>\s*<td>(.+?)<\/td>\s*<td>(.+?)<\/td>

Yes, I am well aware that I could handle this simple example much more easily programmatically and then apply any necessary regexes or simpler pattern matches. The data I am working with is far more complex, and I would really like to use a regex to handle all of the parsing.
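
To make the behaviour described above concrete, here is a minimal sketch. It assumes PHP's preg_match (PHP is the language used in the answer on this page), and the variable names are only illustrative. It runs the quantified pattern against the sample data and shows that the single capture group keeps only its final iteration.

```php
<?php
// Sample data from the question.
$str = "<td>Data 1</td>\n<td>data 2</td>\n<td>data 3</td>\n<td>data 4</td>\n";

// One capture group inside a non-capturing group that repeats four times.
$matched = preg_match('/(?:<td>(.+?)<\/td>\s*){4}/', $str, $m);

// $m[0] is the whole match, so the number of capture groups is count($m) - 1.
$groupCount = count($m) - 1;

// Each repetition overwrites group 1, so only the last value remains.
$lastCapture = $m[1];

echo $groupCount, "\n";   // 1
echo $lastCapture, "\n";  // data 4
```

The pattern defines only one group, so the result array has only one capture slot, however many times the quantifier repeats it.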

stema
Tony Lukasavage
  • I guess you missed the last paragraph. This is a question of "is something possible with a regex", not "what's the best way to parse HTML". – Tony Lukasavage May 16 '11 at 13:11
  • I've removed my first comment, but I disagree that the concept here is worth pursuing. Regex is only good for parsing HTML in *very* simple cases. This isn't such. – lonesomeday May 16 '11 at 13:17
  • Again, this isn't about parsing HTML, it's about whether or not a regex can capture multiple groups using quantifiers. This is a simple example to illustrate the point. – Tony Lukasavage May 16 '11 at 13:21
  • Too bad. [Perl 6 and .NET have the capability](http://stackoverflow.com/questions/2652554/which-regex-flavors-support-captures-as-opposed-to-capturing-groups) to access individual matches in a repeated group, PHP doesn't. – Tim Pietzcker May 16 '11 at 13:55

1 Answer


With PHP you can drop the quantifier and use preg_match_all, which returns every match of the pattern:

$str = '<td>Data 1</td>
<td>data 2</td>
<td>data 3</td>
<td>data 4</td>
';
preg_match_all('/(?:<td>(.+?)<\/td>\s*)/', $str, $m);
print_r($m);

Output:

Array
(
    [0] => Array
        (
            [0] => <td>Data 1</td>

            [1] => <td>data 2</td>

            [2] => <td>data 3</td>

            [3] => <td>data 4</td>

        )

    [1] => Array
        (
            [0] => Data 1
            [1] => data 2
            [2] => data 3
            [3] => data 4
        )

)
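
If the quantifier is wanted as a structural check (exactly four consecutive cells), one possible arrangement is to run the quantified pattern first and then apply preg_match_all to the text it matched. This is only a sketch under that assumption, with illustrative variable names. It still takes two calls, so it does not make a single PCRE pattern return four captures from a repeated group.

```php
<?php
// Sample data from the question.
$str = "<td>Data 1</td>\n<td>data 2</td>\n<td>data 3</td>\n<td>data 4</td>\n";

$cells = [];

// Step 1: the quantified pattern confirms that four cells follow one another
// and returns the whole run as $run[0].
if (preg_match('/(?:<td>.+?<\/td>\s*){4}/', $str, $run)) {
    // Step 2: collect the contents of each cell from the matched run only.
    preg_match_all('/<td>(.+?)<\/td>/', $run[0], $m);
    $cells = $m[1];
}

print_r($cells);
```

With the sample data, $cells holds the four values "Data 1", "data 2", "data 3" and "data 4". If fewer than four consecutive cells are present, it stays empty.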
Toto
  • I upvoted this because a more complex version of this is what I am doing already. It doesn't answer my question about regex capture groups with quantifiers, though. As I stated in the original question, I would like to avoid programmatic answers and would like to know if it's possible from a pure regex perspective. – Tony Lukasavage May 16 '11 at 14:43
  • @Tony Lukasavage: Thanks. Unfortunately, as Tim Pietzcker said in a comment, it's not possible in PHP. – Toto May 16 '11 at 14:49