
I want a preg_match pattern that finds a given string and returns the element wrapping it. I have a string and some HTML like:

$string = "My text";
$html = "<div><p class='text'>My text</p><span>My text</span></div>";

So I need to create a function that returns the element wrapping the string, like:

$element = get_wrapper($string, $html);

function get_wrapper($str, $code){
    //code here that has preg_match and return the wrapper element
}

The return value will be an array, since there are two possible wrappers here: <p class='text'></p> and <span></span>.

Can anyone give me a regex pattern that gets the HTML element wrapping the given string?

Thanks! Answers are greatly appreciated.

PHP Noob
  • [using `preg_match` to parse HTML is bad bad bad idea](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – xkeshav Jul 31 '12 at 05:37
  • Try http://php.net/dom and http://php.net/domxpath. If anyone suggests using a regex, beat them with a dead fish, and then go look at those two links. – Marc B Jul 31 '12 at 05:38
  • @diEcho: can you give me a regex pattern that detects the given string ? thanks – PHP Noob Jul 31 '12 at 05:44

3 Answers


Using a regex for this task is a bad idea. You can use DOMDocument instead:

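$oDom = new DOMDocument('1.0', 'UTF-8');
// loadXML() requires well-formed markup; the extra <div> gives the fragment a single root
$oDom->loadXML("<div>" . $sHtml . "</div>");
$aWrappers = get_wrapper($s, $oDom);

where get_wrapper walks the tree recursively:

function get_wrapper($s, $oNode, &$aFound = array()) {
    foreach ($oNode->childNodes as $oItem) {
        // Only elements can wrap the string; skip text nodes and the like
        if ($oItem->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }
        if ($oItem->nodeValue == $s) {
            // Needed tag: record its name
            $aFound[] = $oItem->nodeName;
        } else {
            get_wrapper($s, $oItem, $aFound);
        }
    }
    return $aFound;
}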
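
For comparison, here is a minimal self-contained sketch of the same idea using DOMDocument::loadHTML() and DOMXPath, as suggested in the comments. The function name find_wrappers is hypothetical (it is not from the original answer), and the sketch assumes the wrapper should be reported by tag name only and that the string must match a text node exactly after trimming.

```php
<?php
// Hypothetical helper: returns the tag name of every element that
// directly contains a text node equal to $str.
function find_wrappers($str, $code) {
    $dom = new DOMDocument();
    // loadHTML() accepts fragments; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($code);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $wrappers = array();
    // Visit every text node; its parent is the wrapping element.
    foreach ($xpath->query('//text()') as $text) {
        if (trim($text->nodeValue) === $str) {
            $wrappers[] = $text->parentNode->nodeName;
        }
    }
    return $wrappers;
}

$html = "<div><p class='text'>My text</p><span>My text</span></div>";
print_r(find_wrappers("My text", $html)); // p and span
```

Comparing the text in PHP rather than inside the XPath expression avoids having to escape quotes in $str.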

The simple pattern below would work, but it assumes a lot about the markup. Regexes shouldn't be used for this; you should look at something like the Simple HTML DOM parser, which is more intelligent.

Anyway, a regex that matches the string together with its wrapper tags is as follows (in PHP, run it with preg_match_all, since PCRE has no g modifier):

 /[A-Za-z'= <]*>My text<[A-Za-z\/>]*/
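
A rough sketch of how this pattern might be applied in PHP follows. PHP's PCRE functions do not accept a g modifier, so preg_match_all() is used to collect every match. The function name match_with_tags is hypothetical, and passing the search string through preg_quote() is an added assumption so that special characters in it are treated literally.

```php
<?php
// Hypothetical helper: returns each occurrence of $string together with
// the tags immediately around it, using the character classes from the answer.
function match_with_tags($string, $html) {
    $pattern = "/[A-Za-z'= <]*>" . preg_quote($string, '/') . "<[A-Za-z\\/>]*/";
    preg_match_all($pattern, $html, $matches);
    return $matches[0]; // full matches only
}

$html = "<div><p class='text'>My text</p><span>My text</span></div>";
print_r(match_with_tags("My text", $html));
// <p class='text'>My text</p> and <span>My text</span>
```

The character classes only allow letters, quotes, = and spaces inside the opening tag, so attributes containing digits, hyphens or double quotes would cut the match short.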
Anirudh Ramanathan

Even though regex is never the correct answer for DOM parsing, I came up with another (quite simple) solution:

(<[^>/]+?>)My text(</.+?>)

It works if the HTML is well formed (i.e. it has closing tags, < is replaced with &lt;, and so on). The first regex group then holds the opening tag and the second the closing one.
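
As an illustration, the opening and closing tags can be read from two capture groups and joined into the form the question asks for. This is only a sketch: the function name wrapper_tags is hypothetical, the parentheses around the two tag sub-patterns and the use of preg_quote() are assumptions, and ~ is chosen as the delimiter so the slashes in the pattern need no escaping.

```php
<?php
// Hypothetical helper: returns "<open></close>" for every element
// whose content is exactly $string.
function wrapper_tags($string, $html) {
    $pattern = '~(<[^>/]+?>)' . preg_quote($string, '~') . '(</.+?>)~';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $wrappers = array();
    foreach ($matches as $m) {
        // $m[1] is the opening tag, $m[2] the closing tag
        $wrappers[] = $m[1] . $m[2];
    }
    return $wrappers;
}

$html = "<div><p class='text'>My text</p><span>My text</span></div>";
print_r(wrapper_tags("My text", $html));
// <p class='text'></p> and <span></span>
```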

Gabber