0

What is a good URL validation function in PHP to detect XSS?

I tried PHP's URL filter (FILTER_VALIDATE_URL), but it still allows URLs like:

http://example.com?<script></script>
omma2289

2 Answers

1

You could try testing against these patterns:

// Set the patterns we'll test against
$patterns = array(
    // Match any attribute starting with "on" or xmlns
    '#(<[^>]+[\x00-\x20\"\'\/])(on|xmlns)[^>]*>?#iUu',

    // Match javascript:, livescript:, vbscript: and mocha: protocols
    '!((java|live|vb)script|mocha):(\w)*!iUu',

    // Match the -moz-binding CSS property (legacy Firefox XBL bindings)
    '#-moz-binding[\x00-\x20]*:#u',

    // Match style attributes
    '#(<[^>]+[\x00-\x20\"\'\/])style=[^>]*>?#iUu',

    // Match unneeded tags
    '#</*(applet|meta|xml|blink|link|style|script|embed|object|iframe|frame|frameset|ilayer|layer|bgsound|title|base)[^>]*>?#i'
);
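
The snippet above only defines the patterns. A minimal sketch of how they might be applied with preg_match() follows; matchesAnyPattern is a hypothetical helper name, and only two of the patterns are repeated so the example runs on its own. A blacklist like this is easy to bypass, so a non-match proves nothing about safety.

```php
<?php
// Two of the patterns from above, repeated so the sketch is self-contained.
$patterns = array(
    '!((java|live|vb)script|mocha):(\w)*!iUu',
    '#</*(applet|meta|xml|blink|link|style|script|embed|object|iframe|frame|frameset|ilayer|layer|bgsound|title|base)[^>]*>?#i',
);

// Hypothetical helper: true if any pattern matches or cannot be evaluated.
function matchesAnyPattern(string $input, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        // preg_match() returns 1 on a match, 0 on no match, and false on
        // error (for example invalid UTF-8 with the /u modifier).
        // Anything other than a clean 0 is treated as suspicious here.
        if (preg_match($pattern, $input) !== 0) {
            return true;
        }
    }
    return false;
}

var_dump(matchesAnyPattern('http://example.com?<script></script>', $patterns)); // bool(true)
var_dump(matchesAnyPattern('http://example.com/page?id=1', $patterns));         // bool(false)
```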

And instead of trying to detect XSS attacks, just make sure to use proper sanitizing.
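
As a rough illustration of what "proper sanitizing" could mean when the URL ends up in HTML, the sketch below encodes it at output time with htmlspecialchars(). renderLink is a hypothetical helper name, and the example assumes the value is placed in a double-quoted attribute and in element text. Encoding alone does not stop a javascript: URL in an href, so the scheme still has to be checked separately.

```php
<?php
// Hypothetical helper: render a user-supplied URL as an HTML link.
function renderLink(string $url): string
{
    // Encode &, <, >, " and ' so the value cannot break out of the
    // attribute or the element text.
    $safe = htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    return '<a href="' . $safe . '">' . $safe . '</a>';
}

echo renderLink('http://example.com?<script></script>');
// <a href="http://example.com?&lt;script&gt;&lt;/script&gt;">http://example.com?&lt;script&gt;&lt;/script&gt;</a>
```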

SpencerX
  • @zerkms - wouldn't it be nice to explain what is awful and provide your solution, for the benefit of the OP and others seeking an answer? – raidenace Jul 27 '14 at 06:53
  • @raidenace: it's a valid question. Well, it's awful because it makes too many decisions about the context the data will be used in. We all know that XSS (and some other types of vulnerabilities, like SQL injection) is a modification of a string that mutates the resulting AST. From this point of view, the solution that **always** works (some people don't believe it for some reason, though) is to encode data according to the context's requirements. That's why this code is terrible. So the solution that works for everyone in every case is just to check the input format specification and **encode** the data. – zerkms Jul 27 '14 at 06:57
  • @zerkms - that's better. It is helpful to leave critiques along with 'that's just awful' sound bites. – raidenace Jul 27 '14 at 07:01
  • @raidenace: now you see why the solutions are defined and known? Because the input formats are defined and known. – zerkms Jul 27 '14 at 07:01
  • @zerkms - Please don't misconstrue. I only meant your response was better than the first. Nothing more. – raidenace Jul 27 '14 at 07:02
  • @raidenace: oh... So you still think that the HTML standard doesn't define everything one should know to write XSS-free code? Well, nothing to add here then, sorry :-( – zerkms Jul 27 '14 at 07:04
0

You could use htmlspecialchars() and strip_tags(), though on their own they will not prevent every attack.

Use a combination of these as well as other heuristics. Ideally write your own function where you pass content through a chain of checks. filter_input can be used in it as well.
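
One possible shape for such a chain is sketched below. isAcceptableUrl is a hypothetical name, the http/https allowlist is an assumption about what the application needs, and filter_var() is used in place of filter_input() so the example works on a plain string; both apply the same filters.

```php
<?php
// Hypothetical validator: every check must pass for the URL to be accepted.
function isAcceptableUrl(string $url): bool
{
    // 1. Syntactic check using PHP's built-in URL filter.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // 2. Allow only http and https (rejects javascript:, data:, and so on).
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme) || !in_array(strtolower($scheme), ['http', 'https'], true)) {
        return false;
    }

    // 3. Reject raw characters that should be percent-encoded in a URL.
    if (preg_match('/[<>"\'\s]/', $url) === 1) {
        return false;
    }

    return true;
}

var_dump(isAcceptableUrl('https://example.com/page?id=1'));        // bool(true)
var_dump(isAcceptableUrl('http://example.com?<script></script>')); // bool(false)
var_dump(isAcceptableUrl('javascript:alert(1)'));                  // bool(false)
```

Passing these checks only says the string looks like a plain web URL; it still needs to be encoded for whatever context it is written into.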

raidenace