I'm familiar with the usual persistent XSS, where content that came from user input has to be escaped (with HTML entities) when it is output in the templates.

Recently, I encountered a non-persistent (reflected) one, where a user can put a script in the URL and the URL is then echoed somewhere on the page. In my case, it was in a link tag.

So I have the following link tag that uses the current URL.

<link rel="next" href="{current_url}" />

The problem is when someone sends a link such as:

www.example.com/?%27;alert...

A %27 (single quote) or %22 (double quote) can close the href attribute and then the tag, allowing the user to inject scripts, etc.

I know the usual way of preventing XSS is to use HTML entities. In this case, won't that break the URL? Is it possible to use URL encoding instead?

Btw, I'm using PHP and would prefer to use native functions.

gerky
  • Which templating system are you using here? Does it have a url encode filter? – Gray Jun 30 '15 at 14:06
  • Is this Smarty? [Then you could use a filter](http://www.smarty.net/docsv2/de/language.modifier.escape.tpl) like: `{current_url|escape:'url'}`. – insertusernamehere Jun 30 '15 at 14:13
  • It is 'user input', just in the `url` instead of in 'post' fields etc. You validate it and decide if it is what you expect and is valid? It is never going to get executed? – Ryan Vincent Jun 30 '15 at 14:13
  • Nope it's not smarty. Yep, it is user input and can break the tag and allow the user to add in a script tag. – gerky Jun 30 '15 at 14:26
  • As it is 'user input' , which is always untrusted, then if you always output it encoded with 'htmlentities' or some other encoding that ensures it cannot be executed then will it be safe? – Ryan Vincent Jun 30 '15 at 14:35
  • Man if you are encoding the special characters then it will not break em all. It will render as only. – Vaibs Jul 02 '15 at 12:26

3 Answers

I know you said you prefer native functions, but I have generally been able to find ways to bypass most solutions. This library, however, definitely does the job. It is a little slow if you call it many times (more than 1000 calls per request would slow your page down).

http://htmlpurifier.org/

Tony Vance

All content coming from users should be escaped, whether from the URL or from the database. In this case, you'll just do URL encoding instead of HTML entities. It's possible your templating engine is already smart enough to do this for values going into HTML attributes.
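
As a rough sketch of that idea using only native PHP functions: the URL is rebuilt from its parts, the path and query are percent-encoded, and the result is additionally passed through `htmlspecialchars`, since the value ends up inside an HTML attribute. Escaping `&` as `&amp;` there does not break the link, because the browser decodes it back before using the URL. The helper name `safe_current_url`, the hard-coded `https://` scheme, and the sample host and parameters are assumptions made for illustration, not something from the original post.

```php
<?php
// Hypothetical helper (illustration only): rebuilds a URL from its parts so
// that it can be printed inside an HTML attribute such as href="...".
function safe_current_url(string $host, string $path, array $query): string
{
    // Percent-encode each path segment while keeping the "/" separators.
    $segments = array_map('rawurlencode', explode('/', $path));
    $url = 'https://' . $host . implode('/', $segments);

    if ($query !== []) {
        // http_build_query percent-encodes keys and values,
        // so a quote in the input becomes %27 or %22.
        $url .= '?' . http_build_query($query);
    }

    // Escape for the HTML attribute context. ENT_QUOTES covers both quote
    // styles; "&" between parameters becomes "&amp;".
    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
}

// Assumed sample input: a query key that tries to break out of the attribute.
$href = safe_current_url('www.example.com', '/', ["';alert(1)//" => '', 'page' => '2']);
echo '<link rel="next" href="' . $href . '" />', "\n";
```

In a real page the query array would typically come from `$_GET`, and the host is better taken from configuration than from the request, since the `Host` header is also user-controlled.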

nkorth
  • This answer was the most helpful. I ended up url encoding the query params (using http_build_query for PHP). – gerky Jul 05 '15 at 14:18

Check the answer to "XSS filtering function in PHP"; it's the one with the function below:

    function xss_clean($data)
    {
        /*
         * Function to clean a string to prevent XSS attack.
         */

        // Fix &entity\n;
        $data = str_replace(array('&amp;','&lt;','&gt;'), array('&amp;amp;','&amp;lt;','&amp;gt;'), $data);
        $data = preg_replace('/(&#*\w+)[\x00-\x20]+;/u', '$1;', $data);
        $data = preg_replace('/(&#x*[0-9A-F]+);*/iu', '$1;', $data);

        // decode
        $data = html_entity_decode($data, ENT_COMPAT, 'UTF-8');

        // Remove any attribute starting with "on" or xmlns
        $data = preg_replace('#(<[^>]+?[\x00-\x20"\'])(?:on|xmlns)[^>]*+>#iu', '$1>', $data);

        // Remove javascript: and vbscript: protocols
        $data = preg_replace('#([a-z]*)[\x00-\x20]*=[\x00-\x20]*([`\'"]*)[\x00-\x20]*j[\x00-\x20]*a[\x00-\x20]*v[\x00-\x20]*a[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2nojavascript...', $data);
        $data = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*v[\x00-\x20]*b[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2novbscript...', $data);
        $data = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*-moz-binding[\x00-\x20]*:#u', '$1=$2nomozbinding...', $data);

        // Only works in IE: <span style="width: expression(alert('Ping!'));"></span>
        $data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?expression[\x00-\x20]*\([^>]*+>#i', '$1>', $data);
        $data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?behaviour[\x00-\x20]*\([^>]*+>#i', '$1>', $data);
        $data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:*[^>]*+>#iu', '$1>', $data);

        // Remove namespaced elements (we do not need them)
        $data = preg_replace('#</*\w+:\w[^>]*+>#i', '', $data);

        do
        {
            // Remove really unwanted tags
            $old_data = $data;
            $data = preg_replace('#</*(?:applet|b(?:ase|gsound|link)|embed|frame(?:set)?|i(?:frame|layer)|l(?:ayer|ink)|meta|object|s(?:cript|tyle)|title|xml)[^>]*+>#i', '', $data);
        }
        while ($old_data !== $data);

        // we are done...
        return $data;
    }
Community
Mrk Fldig