I believe that only allowing characters in the range a-z and A-Z would remove any possibility of an XSS attack. Is that right? I've read a lot about simply putting all output through htmlspecialchars(), but it seems there are some cases in which this is not enough to provide complete protection.

Also, if [a-zA-Z]+ is totally safe, is there a way to accept ' and - characters just as safely, without any chance of an XSS attack? (These are the two main characters found in names aside from a-z.)

nickhar
Smithy
  • What exactly are you trying to do? I guess it's got something to do with input validation and RegExes, but you really need to be more clear and specific about what you're trying to achieve, and in which way. – TheWolf Oct 08 '13 at 23:34
  • Limiting to `[a-zA-Z]` should always be okay, but not necessary. Can you provide some detail of instances where `htmlspecialchars()` isn't supposed to be safe? (Or sources saying so) – Pekka Oct 08 '13 at 23:34
  • @Pekka웃 I don't think he's suggesting `htmlspecialchars` would be unsafe, just trying to avoid the chances of forgetting to use it on a specific bit of potentially untrusted output. Which is of course not something to be solved on the input side of things. – Niels Keurentjes Oct 08 '13 at 23:39
  • atk's reply to zerkms' answer on this question: http://stackoverflow.com/questions/3974221/is-there-a-definitive-anti-xss-library-for-php states that if the input were then used in JavaScript or a URL, it would no longer be safe if only htmlspecialchars() had been used. – Smithy Oct 08 '13 at 23:40
  • I'm simply trying to take input, such as a name, to be used later on other pages, while avoiding all possible chances of XSS attacks. I've read the OWASP cheat sheet and thought limiting to a-z would cover all bases? – Smithy Oct 08 '13 at 23:41
  • It also ensures a user can never enter a mail address or a complex password on your website, or for example a feedback form or URL. You're kind of killing your own freedom as a developer with this approach. – Niels Keurentjes Oct 08 '13 at 23:45

1 Answer

There are two sides to this question.

First off: yes of course, if there's no way to 'break out of context' you're eliminating all chances of both XSS and SQL injection exploits. It's impossible to insert either JS or SQL if you can only use alphabetic characters.
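
As a minimal sketch of the whitelist idea above: the helper names `isAlphabeticOnly()` and `isNameLike()` are hypothetical, not from the original post. The second variant shows one way to admit ' and - as the question asks. Note that an apostrophe is significant inside single-quoted HTML attributes and SQL string literals, so a value that passes `isNameLike()` still has to be escaped on output.

```php
<?php
// Hypothetical helpers for illustration; the names are assumptions.

// Accepts only ASCII letters. \A and \z anchor the whole string;
// unlike $, \z does not tolerate a trailing newline.
function isAlphabeticOnly(string $input): bool
{
    return preg_match('/\A[a-zA-Z]+\z/', $input) === 1;
}

// Also accepts apostrophes and hyphens, as in names like O'Brien-Smith.
// The hyphen sits last in the character class so it is taken literally.
function isNameLike(string $input): bool
{
    return preg_match('/\A[a-zA-Z\'-]+\z/', $input) === 1;
}
```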

Second: of course it's not real protection; it's akin to never driving a car again as a failsafe way to avoid accidents. Sooner or later you are going to have input forms on your site that require other characters, and you're going to be screwed. Writing your code to be fundamentally safe, never trusting client input, and properly escaping all HTML generated by your code is, in the end, the only safe route.

What you're trying to do is solve an output problem on the input end, which just doesn't work. If you have arbitrary user input, you'll eventually have people trying to abuse it to do nasty things. Learn how to escape it properly on the output end, use a template system like Twig for output that handles most XSS problems inherently, and use a DAL/ORM like Doctrine or a good parametrized database access API like MySQLi or PDO to avoid SQL injection.
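
A rough sketch of what escaping on the output end can look like when done by hand, one helper per output context. The helper names are hypothetical and this is only an illustration under those assumptions; a template system such as Twig handles the HTML case for you.

```php
<?php
// Hypothetical helpers for illustration; the function names are assumptions.

// HTML body text or a quoted attribute value.
function escapeHtml(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Inline <script> context: emit a JSON string literal and hex-escape
// characters that could close the script element or a surrounding attribute.
function escapeJs(string $value): string
{
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

// A single URL component, such as one query parameter value.
function escapeUrlComponent(string $value): string
{
    return rawurlencode($value);
}

$name = "O'Brien <b>";
echo '<p>Hello, ', escapeHtml($name), '</p>', "\n";
echo '<script>var name = ', escapeJs($name), ';</script>', "\n";
echo '<a href="/search?q=', escapeUrlComponent($name), '">search</a>', "\n";
```

The same value is encoded differently in each context, which is why validating on input cannot stand in for escaping on output.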

Niels Keurentjes
  • I'm using PDO for all database access; it's just XSS I'm struggling with. So am I right in thinking Twig will escape in a similar manner to htmlspecialchars, and both are suitable for simply outputting into HTML? – Smithy Oct 08 '13 at 23:48
  • If you enable [automatic escaping](http://twig.sensiolabs.org/doc/api.html#escaper-extension) in Twig you're generally safe yeah, you'd have to specify a `|raw` filter specifically if you **don't** want to escape. – Niels Keurentjes Oct 08 '13 at 23:55