The general consensus is that in a PHP+MySQL application you should:
- Validate your input (e.g. check that emails are formatted as emails, using filter_var)
- Create, read, update, and delete your data without allowing SQL injection (e.g. using prepared statements with the PDO class)
- Escape your output
This question concerns the last part.
I'd like to construct a simple, unified PHP function that is defined as follows:
function safe_output($var,$context) { ... }
The context might be the HTML body, an HTML attribute, a URL, etc.
Here are some examples:
<h2>HTML_BODY test</h2>
<div><?php echo(safe_output($content,'HTML_BODY')); ?></div>
<h2>HTML_BODY test (textarea)</h2>
<textarea><?php echo(safe_output($content,'HTML_BODY')); ?></textarea>
<h2>HTML_ATTR test</h2>
<div data-html_attr="<?php echo(safe_output($content,'HTML_ATTR')); ?>">See source for data tag</div>
<h2>HTML_ATTR test (input)</h2>
<input type="text" value="<?php echo(safe_output($content,'HTML_ATTR')); ?>">
<h2>URL test</h2>
<a href="<?php echo(safe_output($content,'URL')); ?>">Link</a>
<h2>JS VAR test</h2>
<script>x = <?php echo(safe_output($content,'JS_VAR')); ?></script>
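To make the contexts concrete, the sketch below shows what the PHP built-ins relevant to each context produce for one sample string. The sample value is made up for illustration, and the json_encode result assumes its default flags.

```php
<?php
// Hypothetical sample input containing markup, both quote styles, and an ampersand.
$content = '<b>"Tom" & \'Jerry\'</b>';

// HTML body or quoted attribute: the five markup-significant characters become entities.
$html = htmlspecialchars($content, ENT_QUOTES, 'UTF-8');
// &lt;b&gt;&quot;Tom&quot; &amp; &#039;Jerry&#039;&lt;/b&gt;

// URL component: everything except letters, digits, and -_. is percent-encoded (space becomes +).
$url = urlencode($content);
// %3Cb%3E%22Tom%22+%26+%27Jerry%27%3C%2Fb%3E

// JavaScript value: a double-quoted JSON string literal; note that < > & ' are left as-is by default.
$js = json_encode($content);
// "<b>\"Tom\" & 'Jerry'<\/b>"
```

Note that urlencode() treats its argument as a single URL component, so passing a complete URL (scheme, host, path) would encode the separators as well.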
So far, I have the following:
function safe_output($var, $context = 'HTML_BODY') {
    if (! in_array($context, array('HTML_BODY', 'HTML_ATTR', 'URL', 'JS_VAR'))) {
        return false;
    }
    switch ($context) {
        case 'HTML_BODY':
        case 'HTML_ATTR':
            return htmlspecialchars($var, ENT_QUOTES, 'UTF-8');
        case 'URL':
            return urlencode($var);
        case 'JS_VAR':
            return json_encode($var);
    }
}
I've based this on the OWASP XSS Cheatsheet, selecting the parts relevant to my application (for example, I'm not using user-defined CSS).
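For comparison, here is a stricter variant that might be worth considering. It is only a sketch under assumptions: the name safe_output_strict() is hypothetical, the 'URL' context is assumed to mean a single URL component rather than a whole URL, and the extra flags are one possible choice rather than something the cheat sheet prescribes.

```php
<?php
// Sketch only: safe_output_strict() is a hypothetical name, not an established API.
function safe_output_strict($var, $context = 'HTML_BODY')
{
    switch ($context) {
        case 'HTML_BODY':
        case 'HTML_ATTR':
            // ENT_SUBSTITUTE replaces invalid UTF-8 sequences with U+FFFD
            // instead of making the whole result an empty string.
            return htmlspecialchars((string) $var, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
        case 'URL':
            // Encodes one URL component (space becomes %20); not meant for a full URL.
            return rawurlencode((string) $var);
        case 'JS_VAR':
            // The JSON_HEX_* flags emit <, >, &, ' and " as \uXXXX escapes,
            // so the value cannot introduce markup inside a <script> block.
            return json_encode($var, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
        default:
            // Fail loudly rather than returning false, which echo would print as nothing.
            throw new InvalidArgumentException("Unknown output context: $context");
    }
}
```

With this sketch, safe_output_strict('<a>', 'JS_VAR') yields "\u003Ca\u003E" (including the double quotes), which a JavaScript parser reads back as the original string.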
- Should I use htmlentities() for HTML attributes? (OWASP seems to suggest it, but I can't find a practical example of why.)
- Is there anything I haven't considered that could cause problems?
- Is there a better way to do the whole thing?
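As background for the first question, the two functions can be compared directly. This is a small sketch with a made-up sample value, assuming the source file is saved as UTF-8; with ENT_QUOTES both functions escape the same five markup-significant characters, and htmlentities() additionally converts every character that has a named entity.

```php
<?php
// Hypothetical sample value; assumes this file is saved as UTF-8.
$value = 'café "quoted" <b>';

// Escapes only & " ' < > and leaves other characters untouched.
$special = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
// café &quot;quoted&quot; &lt;b&gt;

// Also converts characters that have named entities, such as é.
$entities = htmlentities($value, ENT_QUOTES, 'UTF-8');
// caf&eacute; &quot;quoted&quot; &lt;b&gt;
```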