
I have the following code, which I'm using to parse HTML and keep only the UL and LI tags:

function strip_tags_content($text, $tags = '', $invert = FALSE) 
{ 

    preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags); 
    $tags = array_unique($tags[1]); 

    if(is_array($tags) AND count($tags) > 0) 
    { 
        if($invert == FALSE) 
        { 
            return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text); 
        } 
        else 
        { 
            return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text); 
        } 
    } 
    elseif($invert == FALSE) 
    { 
        return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text); 
    } 
    return $text; 
}

echo strip_tags_content($html, "<ul><li>");

This works perfectly fine and I get the following return:

<ul><li class="myLi">Item 1</li><li class="myLi">Item 2</li><li class="myLi">Item 3</li></ul>

What I want to do next is to assign a variable:

$myList = strip_tags_content($html, "<ul><li>");

And then for each li value, push the content within that li tag in an array, so I finish with an array containing 3 items: Item 1, Item 2 and Item 3.

But I have no idea how to do this last part. Can someone help, please?

ndm
maltdev
    [Don't use regex to parse HTML](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags). – GrumpyCrouton Apr 18 '18 at 12:46

1 Answer


I would do a str_replace() on it:

$myList = str_replace(array('<li class="myLi">','<ul>','</ul>'), "",  $StringYouHaveAlready);

You will be left with just your Items and </li> in your string.

Then, you can explode on the </li>:

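$myListArray = explode("</li>", $myList);

Note that explode() takes the delimiter first and the string second. You should then be left with your array, as you have asked. Because the string ends with </li>, the last element will be an empty string; you can drop it with array_filter($myListArray, 'strlen').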
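
As the comment on the question points out, string replacement and regex break as soon as the markup changes (a different class name, extra attributes, nested tags). As an alternative, here is a minimal sketch using PHP's built-in DOMDocument instead of str_replace(). It assumes $myList holds the string shown in the question; the variable name $items is just for illustration.

```php
<?php
// Sample input: the output shown in the question.
$myList = '<ul><li class="myLi">Item 1</li><li class="myLi">Item 2</li><li class="myLi">Item 3</li></ul>';

// Let libxml collect parse warnings quietly instead of emitting them,
// which helps if the real HTML is not perfectly well-formed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($myList);

// Collect the text content of every <li>, whatever attributes it has.
$items = [];
foreach ($doc->getElementsByTagName('li') as $li) {
    $items[] = trim($li->textContent);
}

print_r($items);
```

With the input above, $items ends up holding the three strings Item 1, Item 2 and Item 3, and it does not depend on the exact class attribute of the li tags.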

Adam J