
I am new to the world of coding and PHP, and would like to learn the best way to sanitize form data to avoid malformed pages, code injection and the like. Is the sample script I found below a good example?

Code originally posted at http://codeassembly.com/How-to-sanitize-your-php-input/

/**
 * Sanitize a single variable.
 * Returns the variable sanitized according to the desired type or true/false 
 * for certain data types if the variable does not correspond to the given data type.
 * 
 * NOTE: True/False is returned only for telephone, pin, id_card data types
 *
 * @param mixed The variable itself
 * @param string A string containing the desired variable type
 * @return The sanitized variable or true/false
 */

function sanitizeOne($var, $type)
{       
    switch ( $type ) {
    case 'int': // integer
        $var = (int) $var;
        break;

    case 'str': // trim string
        $var = trim ( $var );
        break;

    case 'nohtml': // trim string, no HTML allowed
        $var = htmlentities ( trim ( $var ), ENT_QUOTES );
        break;

    case 'plain': // trim string, no HTML allowed, plain text
        $var =  htmlentities ( trim ( $var ) , ENT_NOQUOTES )  ;
        break;

    case 'upper_word': // trim string, upper case words
        $var = ucwords ( strtolower ( trim ( $var ) ) );
        break;

    case 'ucfirst': // trim string, upper case first word
        $var = ucfirst ( strtolower ( trim ( $var ) ) );
        break;

    case 'lower': // trim string, lower case words
        $var = strtolower ( trim ( $var ) );
        break;

    case 'urle': // trim string, url encoded
        $var = urlencode ( trim ( $var ) );
        break;

    case 'trim_urle': // trim string, url decoded
        $var = urldecode ( trim ( $var ) );
        break;

    case 'telephone': // True/False for a telephone number
        $size = strlen ($var) ;
        for ($x=0;$x<$size;$x++)
        {
            if ( ! ( ( ctype_digit($var[$x] ) || ($var[$x]=='+') || ($var[$x]=='*') || ($var[$x]=='p')) ) )
            {
                return false;
            }
        }
        return true;
        break;

    case 'pin': // True/False for a PIN
        if ( (strlen($var) != 13) || (ctype_digit($var)!=true) )
        {
            return false;
        }
        return true;
        break;

    case 'id_card': // True/False for an ID CARD
        if ( (ctype_alpha( substr( $var , 0 , 2) ) != true ) || (ctype_digit( substr( $var , 2 , 6) ) != true ) || ( strlen($var) != 8))
        {
            return false;
        }
        return true;
        break;

    case 'sql': // Escape the string for use in a MySQL query (returns the escaped string, not true/false)
        //  insert code here, I usually use ADODB -> qstr() but depending on your needs you can use mysql_real_escape_string();
        return mysql_real_escape_string($var);
        break;
    }       
    return $var;
}
Jochem Kuijpers
PeanutsMonkey
  • It does look useful, although `htmlentities` should be replaced with `htmlspecialchars`, with the charset parameter declared. – mario May 02 '11 at 23:26

4 Answers


That script has some nice functions but it doesn't do a good job at sanitizing!

Depending on what you need (and want to accept) you can use:

  • abs() to force a number to be non-negative (note that it also accepts floats)

  • preg_replace('/[^a-zA-Z0-9 .-]/','',$var) to strip special characters from strings, or preg_replace('/\D/','',$var) to remove all non-digit characters

  • ctype_* functions, e.g. ctype_digit($var)

  • filter_var() and filter_input() functions

  • type-cast, e.g. (int)$_GET['id']

  • convert, e.g. $id=$_GET['id']+0;
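
As a rough sketch of how a few of these behave, the snippet below runs them on made-up values. The variable names and sample inputs are hypothetical stand-ins for `$_GET` / `$_POST` data, not part of the original script.

```php
<?php
// Hypothetical raw inputs standing in for request data.
$rawId    = '42abc';
$rawAge   = '27';
$rawName  = "Rob<script>'; --";
$rawPhone = '+1 (555) 010-9999';

// filter_var() validates: it returns the integer on success, false on failure.
$id  = filter_var($rawId, FILTER_VALIDATE_INT);   // false, '42abc' is not an integer
$age = filter_var($rawAge, FILTER_VALIDATE_INT);  // int 27

// A type-cast coerces instead of validating: the leading digits are kept.
$castId = (int) $rawId;                           // int 42

// ctype_digit() is true only for a non-empty string made up entirely of digits.
$isDigits = ctype_digit($rawAge);                 // true

// Whitelist with preg_replace(): drop everything except the listed characters.
$name   = preg_replace('/[^a-zA-Z0-9 .-]/', '', $rawName);  // 'Robscript --'
$digits = preg_replace('/\D/', '', $rawPhone);              // '15550109999'
```

Note the difference between validating (`filter_var`, `ctype_digit`), which lets you reject bad input, and coercing or stripping (casts, `preg_replace`), which silently changes it. Which one you want depends on the field.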

CSᵠ

Your example script isn't great: the so-called sanitisation of a string just trims whitespace off each end. Relying on that would get you into a lot of trouble fast.

There isn't a one-size-fits-all solution. You need to apply the right sanitisation for your application, which depends entirely on what input you need and where it's being used. And you should sanitise at multiple levels in any case: most likely when you receive the data, when you store it, and possibly when you render it.
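
To illustrate the "when you render it" stage, here is a minimal sketch that escapes a value for an HTML context at output time. The helper name `e()` and the sample comment are hypothetical, and UTF-8 is assumed as the page charset.

```php
<?php
// Hypothetical helper: escape a value for output inside HTML.
// Assumes the page is served as UTF-8.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// On receipt: only normalise (here, trim surrounding whitespace).
$comment = trim("  <b>Hi</b> & 'bye'  ");

// On render: escape for the HTML context, so markup in the input is shown as text.
$html = '<p>' . e($comment) . '</p>';

echo $html, PHP_EOL;
// prints: <p>&lt;b&gt;Hi&lt;/b&gt; &amp; &#039;bye&#039;</p>
```

Escaping this late keeps the stored value intact, so the same data can still be escaped differently for other contexts such as SQL or URLs.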

Worth reading, possible dupes:

What's the best method for sanitizing user input with PHP?

Clean & Safe string in PHP

Oliver Emberton
  • Thanks. How do you test for all possible scenarios or at least a subset of these? – PeanutsMonkey May 03 '11 at 00:43
  • @PeanutsMonkey - The _approach_ used in your example was fine, i.e. call a universal sanitisation function which admits only data you specifically define, with a type you specifically requested. That part is good. The problem is that the sanitisation itself isn't robust enough. Sanitising numbers is easy (something like `$i = (float) $i;`); the problem is usually with strings, as these are vulnerable to both SQL injection and XSS attacks. You really need to know what your strings are being used for to do this, though. E.g. must they support foreign characters? – Oliver Emberton May 03 '11 at 10:42

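As a general rule, if you are using PHP and MySQL you will want to escape data going into MySQL like so (note that the mysql_* extension was removed in PHP 7.0, where mysqli_real_escape_string() or prepared statements take its place):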

$something = mysql_real_escape_string($_POST['your_form_data']);

http://php.net/manual/en/function.mysql-real-escape-string.php

David

It's not bad.

For SQL, it's best to avoid the risk altogether by using PDO to bind parameters into your queries instead of escaping them by hand.
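
A minimal sketch of what that looks like with PDO prepared statements. To keep it runnable without a server, it assumes the pdo_sqlite driver is installed and uses an in-memory SQLite database in place of MySQL. The `users` table and the sample input are made up for illustration.

```php
<?php
// Assumption: pdo_sqlite is available. For MySQL you would use a DSN such as
// 'mysql:host=...;dbname=...' with your own credentials instead.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table, for illustration only.
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Hostile-looking input: it is sent as data, never spliced into the SQL text.
$name = "Robert'); DROP TABLE users; --";

$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$insert->execute([':name' => $name]);

$select = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$select->execute([':name' => $name]);
$stored = $select->fetchColumn();

// The value comes back exactly as it was entered, and the table still exists.
var_dump($stored === $name);
```

Because the query text and the values travel separately, there is no escaping step to forget.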

Lightness Races in Orbit
  • You could achieve the same with mysqli. If you're only ever going to use MySQL this might be slightly better. [mysqli prepared statements](http://php.net/manual/en/mysqli.quickstart.prepared-statements.php) – Lightbulb1 Jun 05 '15 at 12:06