As far as I can tell this is not only a PHP problem, but I am asking here about PHP (PHP 7, actually).

Consider this simple regex (it is only an example):

/((\w+): (\d+))+/

and this text to match against:

foo: 2008bar: 2009

The match covers the entire text. The problem is that the sub-captures inside the repeated group are overwritten each time the group repeats as the regex engine advances over the text. As a result, you get only the captures from the last iteration.

I would like to get all valid (correct) captures, that is, the entire history, not only the last ones.

Here is the code to test it:

<?php

$str = 'foo: 2008bar: 2009';

preg_match_all('/((\w+): (\d+))+/', $str, $matches);

print_r($matches);

?>

And here is the output:

Array
(
    [0] => Array
        (
            [0] => foo: 2008bar: 2009
        )

    [1] => Array
        (
            [0] => bar: 2009
        )

    [2] => Array
        (
            [0] => bar
        )

    [3] => Array
        (
            [0] => 2009
        )

)

As you can see, the entire text was matched, but only the last captures were stored. These are missing:

foo: 2008
foo
2008

Thus my question: how can I get the entire "history" of the captures?

greenoldman

1 Answer

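For this task, \G (the "continue" anchor) wears a body-length cape and has X-ray vision. ;)

It lets a match start either at the beginning of the string or exactly where the previous match ended. Dropping the outer repeated group and anchoring with \G turns every name/number pair into its own match, so preg_match_all keeps the captures of each pair instead of only the last one.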

Code:

$str = 'foo: 2008bar: 2009';
var_export(
    preg_match_all(
        '~\G(\w+): (\d+)~',
        $str,
        $out
    )
    ? $out
    : 'no matches'
);

Output:

array (
  0 => 
  array (
    0 => 'foo: 2008',
    1 => 'bar: 2009',
  ),
  1 => 
  array (
    0 => 'foo',
    1 => 'bar',
  ),
  2 => 
  array (
    0 => '2008',
    1 => '2009',
  ),
)
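
As a hedged sketch building on the pattern above: if you would rather have the captures grouped per pair, the PREG_SET_ORDER flag of preg_match_all returns one entry per match. The helper name captureHistory is hypothetical and only used for illustration. The second call shows a consequence of \G worth knowing: the pairs must be contiguous, so matching stops at the first gap (here, the ", " separator).

```php
<?php

// Hypothetical helper (not part of the original answer): returns every
// [name, number] pair found at the start of $str, in order.
function captureHistory(string $str): array
{
    // \G anchors each match to the start of the string or to the end of
    // the previous match; PREG_SET_ORDER groups the captures per match.
    preg_match_all('~\G(\w+): (\d+)~', $str, $sets, PREG_SET_ORDER);

    $pairs = [];
    foreach ($sets as $set) {
        // $set[0] is the full match, $set[1] the name, $set[2] the number.
        $pairs[] = [$set[1], $set[2]];
    }

    return $pairs;
}

// Contiguous pairs: both are captured.
var_export(captureHistory('foo: 2008bar: 2009'));

// A separator breaks the chain: only the first pair is returned,
// because \G cannot match after the ", " gap.
var_export(captureHistory('foo: 2008, bar: 2009'));
```

If the input may contain separators between pairs, one option is to consume them inside the pattern itself (for example with an optional leading group after \G), so that the chain of matches is not interrupted.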
mickmackusa