The general consensus is that in a PHP+MySQL application you should:

  1. Validate your input (e.g. emails should be formatted as emails), for example using filter_var
  2. Add/update/read/delete your data while avoiding SQL injection (e.g. using the PDO class)
  3. Escape your output
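
As a minimal sketch of step 1 only, assuming the email address arrives as a plain string: filter_var with FILTER_VALIDATE_EMAIL returns the address when it is well formed and false otherwise. The helper name validate_email is hypothetical and not part of any library.

```php
<?php
// Hypothetical helper for step 1: returns the address if it is well formed, null otherwise.
function validate_email(string $input): ?string
{
    // filter_var() returns the filtered value on success and false on failure.
    $email = filter_var(trim($input), FILTER_VALIDATE_EMAIL);
    return $email === false ? null : $email;
}

var_dump(validate_email('ben@example.com'));
var_dump(validate_email('not an email'));
```

Validation of this kind decides whether to accept a value; it does not replace escaping at output time, which is what the rest of this question is about.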

This question concerns the last part.

I'd like to construct a simple, unified PHP function that is defined as follows:

function safe_output($var,$context) { ... }

The context might be the HTML body, an HTML attribute, a URL, etc.

Here are some examples:

<h2>HTML_BODY test</h2>
<div><?php echo(safe_output($content,'HTML_BODY')); ?></div>

<h2>HTML_BODY test (textarea)</h2>
<textarea><?php echo(safe_output($content,'HTML_BODY')); ?></textarea>

<h2>HTML_ATTR test</h2>
<div data-html_attr="<?php echo(safe_output($content,'HTML_ATTR')); ?>">See source for data tag</div>

<h2>HTML_ATTR test (input)</h2>
<input type="text" value="<?php echo(safe_output($content,'HTML_ATTR')); ?>">

<h2>URL test</h2>
<a href="<?php echo(safe_output($content,'URL')); ?>">Link</a>

<h2>JS VAR test</h2>
<script>x = <?php echo(safe_output($content,'JS_VAR')); ?></script>

So far, I have the following:

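function safe_output($var, $context = 'HTML_BODY') {
    // Reject any context this function does not know how to escape for.
    if (!in_array($context, array('HTML_BODY', 'HTML_ATTR', 'URL', 'JS_VAR'))) {
        return false;
    }
    switch ($context) {
        case 'HTML_BODY':
        case 'HTML_ATTR':
            // Encodes &, <, > and both quote types.
            return htmlspecialchars($var, ENT_QUOTES, 'UTF-8');
        case 'URL':
            // Percent-encodes the whole value, including : and /.
            return urlencode($var);
        case 'JS_VAR':
            // Produces a JavaScript literal (strings come back quoted).
            return json_encode($var);
    }
}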
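
To make the behaviour concrete, here is a small sketch of what each branch emits for one value. The sample string is made up for illustration, and the last line shows a variant using the JSON_HEX_* flags of json_encode, which the function above does not use.

```php
<?php
// Sample value chosen for illustration; it is not from a real application.
$content = '"><script>alert(1)</script> & more';

// HTML_BODY / HTML_ATTR branch: &, <, > and both quote types become entities.
$html = htmlspecialchars($content, ENT_QUOTES, 'UTF-8');

// URL branch: everything except letters, digits and -_. is percent-encoded; spaces become +.
$url = urlencode($content);

// JS_VAR branch: a quoted string literal; "/" is escaped as "\/" by default.
$js = json_encode($content);

// Variant: the JSON_HEX_* flags also turn <, >, &, ' and " into \uXXXX escapes.
$jsHex = json_encode($content, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

echo $html, "\n", $url, "\n", $js, "\n", $jsHex, "\n";
```

Note that the URL branch encodes every reserved character, so a complete address such as https://example.com/ would come out as https%3A%2F%2Fexample.com%2F.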

I've based this on the OWASP XSS Cheatsheet, selecting the parts relevant to my application (for example, I'm not using user-defined CSS).
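
One possible reading of that guidance for the URL context, sketched under assumptions: urlencode is applied to individual query-string values, while a complete URL destined for href is checked against a list of allowed schemes and then escaped as an HTML attribute. The helper safe_href and its default scheme list are hypothetical, and this is not a complete URL validator.

```php
<?php
// Hypothetical helper: accepts a full URL only if its scheme is on an allow-list,
// then escapes it for use inside a quoted HTML attribute.
function safe_href(string $url, array $allowedSchemes = ['http', 'https']): ?string
{
    // parse_url() returns null when there is no scheme and false on a malformed URL.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme) || !in_array(strtolower($scheme), $allowedSchemes, true)) {
        return null; // e.g. javascript: or a relative/malformed URL
    }
    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
}

// urlencode() applied to a single query-string value rather than the whole URL.
$query = 'search.php?q=' . urlencode('fish & chips');

var_dump(safe_href('https://example.com/?a=1&b=2'));
var_dump(safe_href('javascript:alert(1)'));
var_dump($query);
```

Whether relative URLs should be allowed is left open here; this sketch rejects them because they carry no scheme.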

  • Should I use htmlentities() for HTML attributes? (OWASP seems to suggest it, but I can't find a practical example of why.)
  • Is there anything that could cause problems I haven't considered?
  • Is there a better way to do the whole thing?
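
For reference on the first bullet, a minimal comparison of the two functions on a made-up sample value. With ENT_QUOTES both encode the same markup-significant characters (&, <, > and both quote types); htmlentities additionally converts every character that has a named entity, such as é.

```php
<?php
// Sample value chosen for illustration.
$content = 'café "quoted" <b>';

$special  = htmlspecialchars($content, ENT_QUOTES, 'UTF-8');
$entities = htmlentities($content, ENT_QUOTES, 'UTF-8');

echo $special, "\n";   // café &quot;quoted&quot; &lt;b&gt;
echo $entities, "\n";  // caf&eacute; &quot;quoted&quot; &lt;b&gt;
```

Both results decode back to the same text in a UTF-8 page, so this sketch does not by itself show a case where one is safer than the other in an attribute.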
Ben
  • For your HTML_ATTR you could add strip_tags(), because tags are not allowed in an attribute. But the best practice is to define a whitelist of the attributes you authorize. – Inazo Apr 10 '20 at 09:02
  • Thanks @inazo, wouldn't htmlspecialchars deal with the tags anyway, because they'd be encoded? I may well want to allow the user to use '<' and '>', for example if the string content is someone explaining how to use some HTML code. – Ben Apr 10 '20 at 09:07
  • You should sanitize and validate data when you receive it, so you won't have to worry later when you want to output it. The point is to keep invalid data from entering your database. – Iman Emadi Apr 10 '20 at 09:26
  • @BrightFaith Sometimes it doesn't make it to the database. Might be a simple form submission to the next page. – El_Vanja Apr 10 '20 at 09:30
  • @BrightFaith, that's not the general consensus and isn't always practical anyway. For example, see https://security.stackexchange.com/questions/95325/input-sanitization-vs-output-sanitization, https://stackoverflow.com/questions/129677/how-can-i-sanitize-user-input-with-php etc. The concept isn't without merit, but it isn't going to solve all problems and will cause several more. – Ben Apr 10 '20 at 09:31
  • There are some rules in the OWASP XSS Cheat Sheet that say that in some cases you should not output your data even if it is escaped. Apart from those, you can use different methods in PHP, such as `filter_var` and many others. I have a function with some different options for sanitizing user input; [see if this can be used in your case or not](https://stackoverflow.com/a/61049460/9578875) – Iman Emadi Apr 10 '20 at 09:41

0 Answers