Removing – josh3736 May 13 '11 at 03:27

  • 1
    @Sourav, re stealing cookies: If I could insert `` into a page, I've successfully stolen the cookies of anyone who loads that page. (This is mitigated by the server setting a login cookie as [HttpOnly](http://www.codinghorror.com/blog/2008/08/protecting-your-cookies-httponly.html), but that obviously depends on the site's configuration.) – josh3736 May 13 '11 at 03:32
  • 3

    Probably the simplest method is str_ireplace(), which does a case-insensitive replacement. It won't preserve the original case of a tag written as "sCriPt", but if you're only out to de-fang XSS attacks, that may be just fine:

    $output = str_ireplace("<script>", "&lt;script&gt;", $input);
    

    If it were me I'd use str_ireplace(), but a more complex solution with preg_replace() can preserve the case, at the cost of being slower. This might work:

    $output = preg_replace('/<(script)>/i', '&lt;$1&gt;', $input);
    

    Note: If it is XSS prevention you're after, neither of these handles tags that carry attributes, such as <script type=text/javascript>, because both only match the literal string "<script>". To truly handle these cases, you need to load the HTML string into DOMDocument and delete the offending script nodes.
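
A minimal sketch of the DOMDocument approach mentioned in the note, in case it helps. The helper name `strip_script_nodes()` and the wrapper `<div>` are assumptions made for illustration, not part of the original answer. It only removes `<script>` elements; event-handler attributes such as `onclick` and `javascript:` URLs are left untouched, so it is not a complete XSS filter on its own.

```php
<?php
// Hypothetical helper (name chosen for this sketch): parse an HTML fragment
// and return it with every <script> element removed.
function strip_script_nodes(string $html): string
{
    $doc = new DOMDocument();

    // User-supplied markup is often malformed; collect libxml warnings
    // internally instead of emitting them as PHP warnings.
    $previous = libxml_use_internal_errors(true);

    // The "<?xml encoding" prefix is a common workaround that hints the input
    // is UTF-8. The wrapper <div> (an assumption of this sketch) gives the
    // fragment a single known parent to read the result back from.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName() returns a live list, so copy it to an array
    // before removing nodes. The HTML parser lowercases tag names, which
    // makes the match effectively case-insensitive.
    $scripts = iterator_to_array($doc->getElementsByTagName('script'));
    foreach ($scripts as $script) {
        $script->parentNode->removeChild($script);
    }

    // The first <div> in document order is the wrapper added above.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    if ($wrapper === null) {
        return '';
    }

    // Serialize the wrapper's children, leaving out the wrapper itself.
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$clean = strip_script_nodes(
    '<p>Hello</p><sCriPt type="text/javascript">alert(1)</sCriPt><b>world</b>'
);
echo $clean, "\n";
```

Because the markup is actually parsed, attributes and mixed-case tag names no longer matter, which is the case the two string replacements above miss.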

    Michael Berkowski
    • [Do not use a regex to parse HTML.](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – josh3736 May 13 '11 at 03:18
    • @josh3736 I agree, and as in my answer I wouldn't use `preg_replace`, but in this instance it isn't even being used as a regex. Instead it's just a shortcut to preserve case in a simple string replacement. Anyway, disclaimer added above. – Michael Berkowski May 13 '11 at 03:23
    • `you need to load the HTML string into DOMDocument and delete the offending script nodes` How do I do that? – Sourav May 13 '11 at 03:29
    0

    Is there any reason you can't use htmlspecialchars()?
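
For illustration, a short sketch of what htmlspecialchars() does to a script tag. The variable names and the sample input are made up for this example. The ENT_QUOTES flag and the explicit 'UTF-8' charset are choices made here, not something the question specifies.

```php
<?php
// Sample input (hypothetical): a mixed-case script tag with an attribute.
$input = '<sCriPt type="text/javascript">alert(1)</sCriPt>';

// Escape &, <, > and both quote styles so the browser renders the markup
// as text instead of executing it. The original case is preserved.
$safe = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// prints: &lt;sCriPt type=&quot;text/javascript&quot;&gt;alert(1)&lt;/sCriPt&gt;
```

This escapes every tag rather than only `<script>`, so it fits when the input should be displayed as plain text rather than as HTML.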

    Sean Walsh