1

I am looking for code that lets me replace the content of any HTML tag that has a certain class, e.g.:

$class = "blah";
$content = "new content";
$html = '<div class="blah">hello world</div>';

// code to replace, $html now looks like:
// <div class="blah">new content</div>

Bear in mind that:

  1. It won't necessarily be a div; it could be <h2 class="blah">
  2. The class attribute can contain more than one class and the content still needs to be replaced, e.g. <div class="foo blah green">hello world</div>

I think regular expressions should be able to do this; if not, I am open to other suggestions such as using the DOM class (although I would rather avoid this if possible, because it has to be PHP4 compatible).

Justin Johnson
fire

3 Answers

1

Do not use regular expressions to parse HTML. You can use the built-in DOMDocument, or something like simple_html_dom:

require_once("simple_html_dom.php");

$class = "blah";
$content = "new content";
$html = '<div class="blah">hello world</div>';

$doc = new simple_html_dom();
$doc->load($html);

foreach ( $doc->find("." . $class) as $node ) {
    $node->innertext = $content;
}

Sorry, I didn't see the PHP4 requirement. Here's a solution using the standard DOMDocument as mentioned above.

function DOM_getElementByClassName($referenceNode, $className, $index=false) {
    $className = strtolower($className);
    $response  = array();

    foreach ( $referenceNode->getElementsByTagName("*") as $node ) {
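        // Split the class attribute on whitespace and compare whole tokens,
        // so "blah" does not match "blah-box" and regex metacharacters in
        // $className cannot break the match.
        $nodeClasses = preg_split('/\s+/', strtolower($node->getAttribute("class")), -1, PREG_SPLIT_NO_EMPTY);

        if ( in_array($className, $nodeClasses) ) {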
            $response[] = $node;
        }
    }

    if ( $index !== false ) {
        return isset($response[$index]) ? $response[$index] : false;
    }

    return $response;
}

$doc = new DOMDocument();
$doc->loadHTML($html);

foreach ( DOM_getElementByClassName($doc, $class) as $node ) {
    $node->nodeValue = $content;
}

echo $doc->saveHTML();
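
One thing to watch for with the snippet above: loadHTML() wraps a bare fragment in a full document, so saveHTML() returns the doctype, <html> and <body> along with the fragment. The following is a minimal sketch of one way around that, not part of the original answer. It assumes PHP 5.3.6 or later (for saveHTML() with a node argument), that $class contains no quote characters, and that the input is plain ASCII or that encoding is handled elsewhere. The helper name replace_content_by_class is made up for illustration, and the class lookup is done with an XPath query instead of a loop over every element.

```php
<?php
// Hypothetical helper, not part of the answer above.
function replace_content_by_class($html, $class, $content) {
    if (trim($html) === '') {
        return $html; // nothing to parse
    }

    $doc = new DOMDocument();
    // loadHTML() wraps a bare fragment in <html><body>...</body></html>.
    // The @ hides libxml warnings about imperfect markup.
    @$doc->loadHTML($html);

    // Match $class as a whole whitespace-separated token of the class attribute.
    // Assumption: $class contains no quote characters.
    $xpath = new DOMXPath($doc);
    $query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' " . $class . " ')]";

    foreach ($xpath->query($query) as $node) {
        // Drop the existing children, then insert the new content as plain text.
        while ($node->firstChild) {
            $node->removeChild($node->firstChild);
        }
        $node->appendChild($doc->createTextNode($content));
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $out  = '';
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $doc->saveHTML($child);
        }
    }
    return $out;
}

echo replace_content_by_class('<div class="foo blah green">hello world</div>', 'blah', 'new content');
```

Because the new content is inserted as a text node, any markup in $content is escaped and not parsed. If HTML content is needed, it would have to be parsed and imported as nodes instead.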
Justin Johnson
-1

If you are sure that $html is valid HTML code, you could use an HTML parser, or even an XML parser if it's valid XML code.

But the quick and dirty way in Regex would be something like:

$html = preg_replace('/(<[^>]+ class="[^>]*' . $class . '[^"]*"[^>]*>)[^<]+(<\/[^>]+>)/siU', '$1' . $content . '$2', $html);

Didn't test it too much, but it should work. Tell me if you find cases where it doesn't. ;)

Edit: Added "and dirty"... ;)

Edit 2: New version of the RegEx:

<?php

$class = "blah";
$content = "new content";
$html = '<div class="blah test"><h1><span>hello</span> world</h1></div><div class="other">other content</div><h2 class="blah">remove this</h2>';

$html = preg_replace('/<([\w]+)(\s[^>]*class="[^"]*' . $class . '[^"]*"[^>]*>).+(<\/\\1>)/siU', '<$1$2' . $content . '$3', $html);

echo $html;

?>

The last remaining problem is a class that merely contains "blah" in its name, like "tooMuchBlahNow". Let's see how we can address that. Btw: Is it obvious already that I love playing with RegEx? ;)
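
One possible way to address the "tooMuchBlahNow" case is to require that the class name is delimited by whitespace or by the quotes of the attribute. The following is only a sketch under several assumptions, not a finished version of the answer's regex: the class attribute is double-quoted, matching is case-sensitive (the i and U flags are dropped in favour of an explicit lazy .*?), and a matched element does not contain another element with the same tag name. The function name replace_content_by_class_regex is made up for illustration, and the closure needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper illustrating whole-token class matching with a regex.
function replace_content_by_class_regex($html, $class, $content) {
    // Escape regex metacharacters in the class name.
    $quoted = preg_quote($class, '/');

    $pattern = '/<(\w+)'                                   // 1: tag name
             . '(\s(?:[^>]*\s)?class="'                    // 2: rest of the opening tag...
             . '(?:[^"]*\s)?' . $quoted . '(?:\s[^"]*)?'   // ...with $class as a whole token
             . '"[^>]*>)'
             . '.*?<\/\1>/s';                              // old content up to the first matching close tag

    // A callback avoids "$1" or backslashes in $content being read as backreferences.
    return preg_replace_callback($pattern, function ($m) use ($content) {
        return '<' . $m[1] . $m[2] . $content . '</' . $m[1] . '>';
    }, $html);
}

$html = '<div class="blah test"><h1><span>hello</span> world</h1></div>'
      . '<div class="other">other content</div>'
      . '<h2 class="blah">remove this</h2>'
      . '<p class="tooMuchBlahNow">keep this</p>';

echo replace_content_by_class_regex($html, 'blah', 'new content');
```

The lazy .*? stops at the first closing tag with the same name, so something like <div class="blah"><div>inner</div>tail</div> would still be mangled. That kind of nesting is where a real parser remains the safer choice.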

b_i_d
  • @b-i-d this seems to work for alphanumeric characters but not if there is HTML inside of the tag...? – fire Jun 04 '10 at 09:27
  • Yeah, that wasn't in the specs of the question. For HTML in the tag the RegEx has to be a little more complicated. Let me think about it... – b_i_d Jun 04 '10 at 21:23
  • Updated my answer. Only one "problem" left. – b_i_d Jun 04 '10 at 22:03
-2

There is no need to use the DOM class. This would probably be done quickest using jQuery, as Khnle said, or you could use the preg_replace() function. Give me some time and I may write a quick regex for you.

But I would recommend using something like jQuery; this way you can serve the page to the user quickly and let their computer do the processing instead of your server.

Brendan
  • And what if the user has disabled JavaScript? It's never a good idea to do something in JS if you can do it on the server, as the server is way faster than a client might be. – 2ndkauboy Jun 03 '10 at 17:15
  • Javascript and jQuery are not the solution for everything. – Matt Jun 03 '10 at 17:16
  • Plus, the whole point is that I can't use jQuery; it has to be server side! – fire Jun 04 '10 at 08:14