Regex Question: Extract words

lucky
charms

from string:

<a>lucky <b>charms</b></a>

My attempt:

preg_match_all("/<(.*)>(.*)<\/(.*)>+/is", $text, $matches);
print_r($matches);

Result:

Array
(
    [0] => Array
        (
            [0] => <a>lucky <b>charms</b></a>
        )

    [1] => Array
        (
            [0] => a>lucky <b>charms</b
        )

    [2] => Array
        (
            [0] => 
        )

    [3] => Array
        (
            [0] => a
        )

)
Flip Booth
  • And what's the problem? Regex too greedy? Make it less greedy. – mario Feb 25 '13 at 22:59
  • possible duplicate of [Why does this regex match too much? (Doesn't stop at slash)](http://stackoverflow.com/questions/8100746/why-does-this-regex-match-too-much-doesnt-stop-at-slash) – mario Feb 25 '13 at 23:00
  • Have you tried using `strip_tags()`? – Tchoupi Feb 25 '13 at 23:01
  • @Mathieu Imbert I don't want to strip the tags, because I will be replacing lucky and charms later on and will still need them. – Flip Booth Feb 25 '13 at 23:03
  • You could always make a copy of it, then use strip tags on the copy. Then you'd get your lucky charms, but not lose your original tags – starshine531 Feb 25 '13 at 23:19
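
The copy-then-strip idea from the last comment could look roughly like this. It is only a sketch: `$html`, `$plain`, and `$words` are made-up names, and splitting on whitespace assumes each word you want contains no spaces.

```php
<?php
// Hypothetical variable names; the original markup stays untouched for later replacement.
$html = '<a>lucky <b>charms</b></a>';

// strip_tags() returns a new string, so $html still has its tags.
$plain = strip_tags($html);

// Split on runs of whitespace and drop any empty pieces.
$words = preg_split('/\s+/', $plain, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);
```

This only tells you which words are present. It does not tell you where they sit in the markup, so a later replacement would still need to work on `$html`.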

3 Answers

If you always have that structure, you can use:

preg_match("#<(.*?)>(.*?)<(.*?)>(.*?)</\\3></\\1>#is", '<a>lucky <b>charms</b></a>', $matches);

After this call, $matches contains:

array(5) {
  [0]=>
  string(26) "<a>lucky <b>charms</b></a>"
  [1]=>
  string(1) "a"
  [2]=>
  string(6) "lucky "
  [3]=>
  string(1) "b"
  [4]=>
  string(6) "charms"
}
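
To get just the two words out of that result, something along these lines might do. It is a sketch that assumes the input always has the fixed two-level shape shown; `$words` is a name introduced here for illustration.

```php
<?php
// Assumption: the input always looks like <outer>text <inner>text</inner></outer>.
$text = '<a>lucky <b>charms</b></a>';

// \3 and \1 are backreferences: each closing tag must repeat its opening tag's name.
$pattern = '#<(.*?)>(.*?)<(.*?)>(.*?)</\3></\1>#is';

$words = [];
if (preg_match($pattern, $text, $matches) === 1) {
    // Groups 2 and 4 hold the text; trim() removes the trailing space after "lucky".
    $words = [trim($matches[2]), trim($matches[4])];
}

print_r($words);
```

If the markup nests deeper or has a different number of elements, the pattern will not match and `$words` stays empty.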
nickb

Your regex doesn't work because * is greedy. Use ? to make * non-greedy, or change your regex to something like this:

<([^>]+)>(.*?)</\1>
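
Dropped into PHP, that pattern could be tried like this. It is a rough sketch: the `#` delimiters and the `is` flags are choices made here, not part of the pattern above.

```php
<?php
$text = '<a>lucky <b>charms</b></a>';

// '#' is the delimiter, so the '/' in the closing tag needs no escaping.
// [^>]+ cannot run past the end of the opening tag;
// .*? stops at the first closing tag whose name matches group 1.
$pattern = '#<([^>]+)>(.*?)</\1>#is';

preg_match($pattern, $text, $matches);

echo $matches[1], "\n"; // tag name: a
echo $matches[2], "\n"; // inner content: lucky <b>charms</b>
```

Note that group 2 still contains the nested `<b>` element, so on its own this captures the outer element's content, not the two separate words. You would have to apply the pattern again to group 2 to reach "charms".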
Philipp

How about capturing everything between the `>` that ends one tag and the `<` that starts the next:

preg_match_all("/\>([^\<]+)\</is", $text, $matches);

Then the matches you want are in $matches[1]:

Array
(
    [0] => Array
        (
            [0] => >lucky <
            [1] => >charms<
        )
    [1] => Array
        (
            [0] => lucky 
            [1] => charms
        )
)
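
Since "lucky " comes back with its trailing space, you may want to tidy the captures afterwards. A possible sketch, where `$words` is a name introduced here and the backslashes before `<` and `>` are dropped because PCRE does not need them:

```php
<?php
$text = '<a>lucky <b>charms</b></a>';

// Capture any run of non-'<' characters sitting between a '>' and a '<'.
preg_match_all('/>([^<]+)</', $text, $matches);

// Trim each capture and drop pieces that were only whitespace
// (for example, indentation between tags in pretty-printed markup).
$words = array_values(array_filter(array_map('trim', $matches[1]), 'strlen'));

print_r($words);
```

The untrimmed captures remain available in `$matches[1]` if the exact spacing matters for a later replacement.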
Stephen Ostermiller