
I want to run htmlentities() only on the contents within <code> things to strip </code>

I wrote a function that takes a string and returns the content between <code> </code>:

function parse_code($string) {

    // test the string and find contents within <code></code>
    preg_match('@<code>(.*?)</code>@s', $string, $matches);

    // return the match if it succeeded
    if ($matches) {
        return $matches[1];
    } else {
        return false;
    }
}

However, I need help with a function that will actually run htmlentities() on the contents within <code> </code> and then implode() it all back together. For example, let's say we have the string below.

<div class="myclass" rel="stuff"> things in here </div>
<code> only run htmlentities() here so strip out things like < > " ' & </code>
<div> things in here </div>

Once again, the function needs to keep the rest of the string unchanged and run htmlentities() only on the contents of <code> </code>.

kr1zmo
  • I suggest you read this - http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Phil Feb 14 '11 at 05:33

1 Answer


You can simplify this by using a custom callback function:

$html = preg_replace_callback(
     "@  (?<= <code>)  .*?  (?= </code>)  @six",
     "htmlentities_cb", $html
);

function htmlentities_cb($matches) {
    return htmlentities($matches[0], ENT_QUOTES, "UTF-8");
}

The syntax for matching the enclosing code tags is called lookbehind and lookahead assertion. It simplifies the callback and avoids implode() later, because the text matched by the assertions does not become part of $matches[0]. The six modifiers after the closing @ delimiter make . match newlines (s), make the tag match case-insensitive (i), and allow whitespace in the regex to make it more readable (x).
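To make the behaviour concrete, here is a minimal sketch that wraps the same call in a function and runs it on a short sample string. The name escape_code_blocks() and the sample input are made up for illustration; the regex is the one above without the x flag, so the padding whitespace is dropped, and the `?? $html` fallback is an added assumption that returns the input unchanged if the regex engine reports an error.

```php
<?php
// Hypothetical helper (not from the original answer): escapes only the text
// between <code> and </code> and leaves the rest of the string untouched.
function escape_code_blocks(string $html): string
{
    $result = preg_replace_callback(
        // Lookbehind/lookahead keep the tags themselves out of $matches[0].
        '@(?<=<code>).*?(?=</code>)@si',
        function (array $matches): string {
            return htmlentities($matches[0], ENT_QUOTES, 'UTF-8');
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

// Made-up sample input in the spirit of the question.
$query = '<div class="a">kept</div><code><b>bold</b> & "quoted"</code><div>kept</div>';

echo escape_code_blocks($query), "\n";
// prints: <div class="a">kept</div><code>&lt;b&gt;bold&lt;/b&gt; &amp; &quot;quoted&quot;</code><div>kept</div>
```

Because the lookbehind has to be a fixed string, this only matches a bare <code> tag; a tag with attributes such as <code class="php"> would be skipped and would need a different pattern.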

mario
  • I'm going to test this out, and if it worked, I love you I love you I love you. – kr1zmo Feb 14 '11 at 05:22
  • @kr1zmo: Tested it, but better give it a try with your code. Also don't forget the extra htmlentities parameters (like I did) for better security. – mario Feb 14 '11 at 05:25
  • so how exactly would I run this if I have my string $query = ''; that has all the html and code tag. – kr1zmo Feb 14 '11 at 05:25
  • @kr1zmo: Just use `$query` in place of `$html =` and `, $html` in the preg_replace_callback invocation. – mario Feb 14 '11 at 05:27