I'm trying to match an HTML tag name along with its attribute names. In the example below, I am trying to match div, class, style and id.

$html='<div class="nav" style="float:left;" id="navigation">';
preg_match_all("/(([^<]\w+\s)|(\S+)=)/", $html, $match);

This returns the array shown below.

As you can see, the correct results end up in Array[2] and Array[3]. Is it possible to put the results in a single array, perhaps in Array[1]? I'm not sure how to do this.

Array
(
[0] => Array
    (
        [0] => div 
        [1] => class=
        [2] => style=
        [3] => id=
    )

[1] => Array
    (
        [0] => div 
        [1] => class=
        [2] => style=
        [3] => id=
    )

[2] => Array
    (
        [0] => div 
        [1] => 
        [2] => 
        [3] => 
    )

[3] => Array
    (
        [0] => 
        [1] => class
        [2] => style
        [3] => id
    )

)
user1448031

1 Answer

You can use this simple regex:

(?<=<)\w++|\b\w++(?==)

where (?<=...) is a lookbehind and (?=...) is a lookahead.

Example:

preg_match_all('~(?<=<)\w++|\b\w++(?==)~', $html, $matches);
print_r($matches);
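
To make the result concrete, here is a minimal self-contained sketch of the call above, using the `$html` string from the question. Because the lookarounds are zero-width, the `<` and `=` are checked but not consumed, so the names land directly in `$matches[0]`. The variable `$names` is introduced here only for illustration.

```php
<?php
// Sample input taken from the question.
$html = '<div class="nav" style="float:left;" id="navigation">';

// First branch: word characters preceded by "<" (the tag name).
// Second branch: a whole word followed by "=" (an attribute name).
preg_match_all('~(?<=<)\w++|\b\w++(?==)~', $html, $matches);

// There are no capture groups, so index 0 (the full matches) is the only sub-array.
$names = $matches[0];
print_r($names);
// Array ( [0] => div [1] => class [2] => style [3] => id )
```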

But if you use several capturing groups and want the results in a single array, you can use the branch reset feature. Example (without lookarounds):

preg_match_all('~(?|<(\w++)|\b(\w++)=)~', $html, $matches);
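
As a rough illustration of what branch reset changes, the sketch below runs that pattern on the `$html` string from the question. Inside `(?|...)` both alternatives number their capture group as group 1, so every name ends up in `$matches[1]`, while `$matches[0]` still holds the full matches including the `<` and `=`.

```php
<?php
// Sample input taken from the question.
$html = '<div class="nav" style="float:left;" id="navigation">';

// (?| ... ) is a branch reset group: each alternative restarts
// capture-group numbering, so both (\w++) groups are group 1.
preg_match_all('~(?|<(\w++)|\b(\w++)=)~', $html, $matches);

print_r($matches[0]); // full matches: "<div", "class=", "style=", "id="
print_r($matches[1]); // captured names: "div", "class", "style", "id"
```

Without the branch reset, the two groups would be numbered 1 and 2, and the names would again be split across two sub-arrays with empty strings in between.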

(About the ++: it is a possessive quantifier, which tells the regex engine that it does not need to backtrack into that part of the match (among other things, backtracking positions are not recorded). This improves the performance of the pattern, but it is not essential, particularly for small strings. You can find more information about this feature here and here.)
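
A small sketch of the difference, under the assumption that a toy subject like `banana` is representative enough: with a greedy `\w+` the engine can give characters back so that a following `a` matches, whereas a possessive `\w++` keeps everything it consumed. In the patterns above, `\w++` is followed by `=`, which `\w` can never match, so giving characters back could not help anyway and both forms return the same result. The variable names here are illustrative only.

```php
<?php
$subject = 'banana';

// Greedy: \w+ first takes "banana", then backtracks one character so "a" can match.
$greedy = preg_match('~\w+a~', $subject);

// Possessive: \w++ takes "banana" and never gives anything back, so "a" cannot match.
$possessive = preg_match('~\w++a~', $subject);

var_dump($greedy, $possessive); // int(1) int(0)

// For the tag/attribute pattern, the two forms agree on the question's input.
$html = '<div class="nav" style="float:left;" id="navigation">';
preg_match_all('~(?<=<)\w+|\b\w+(?==)~', $html, $plain);
preg_match_all('~(?<=<)\w++|\b\w++(?==)~', $html, $poss);

var_dump($plain[0] === $poss[0]); // bool(true)
```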

Casimir et Hippolyte
  • Great. That seems to give a result that I am after. I'm not sure why you've used double `+` as in `\w++`. I tried `(?<=<)\w+|\w+(?==)` and it seems to give the same result. Maybe there is something I am not understanding. – user1448031 Nov 16 '13 at 16:35
  • @user1448031: I have added some reference about this strange thing. – Casimir et Hippolyte Nov 16 '13 at 16:41