1

I'm conducting quite an extensive list of pre-database-insert filters and I'm getting quite bummed out about how long and ugly the code is:

/*******************************************************************
* START OF sanitising input 
********************************************************************/
// main user inputs
$title  = filter_var($place_ad['title'], FILTER_SANITIZE_STRING); 
$desc   = filter_var($place_ad['desc'], FILTER_SANITIZE_SPECIAL_CHARS);
$cat_1  = filter_var($place_ad['cat_1'], FILTER_SANITIZE_NUMBER_INT);
$cat_2  = filter_var($place_ad['cat_2'], FILTER_SANITIZE_NUMBER_INT);
$cat_3  = filter_var($place_ad['cat_3'], FILTER_SANITIZE_NUMBER_INT);
$price  = filter_var($place_ad['price'], FILTER_SANITIZE_NUMBER_FLOAT,FILTER_FLAG_ALLOW_FRACTION);
$suffix = filter_var($place_ad['suffix'], FILTER_SANITIZE_STRING); 

// check input
if(empty($title) || strlen($title) < 3 || strlen($title) > 100) { $error[] = 'Title field empty, too long or too short.'; }
if(empty($desc) || strlen($desc) < 3 || strlen($place_ad['desc']) > 5000) { $error[] = 'Description field empty, too long or too short.'; } 
if(empty($cat_1) || empty($cat_2)) { $error[] = 'You did not select a category for your listing.'; }
if(empty($price) || $price < 0 || $price > 1000000) { $error[] = 'Price field empty, too low or too high.'; }


// google location stuff
$lat    = filter_var($place_ad['lat'], FILTER_SANITIZE_NUMBER_FLOAT,FILTER_FLAG_ALLOW_FRACTION);    
$lng    = filter_var($place_ad['lng'], FILTER_SANITIZE_NUMBER_FLOAT,FILTER_FLAG_ALLOW_FRACTION);
$formatted_address   = filter_var($place_ad['formatted_address'], FILTER_SANITIZE_STRING);

// check input
if(empty($lat) || empty($lng)) { $error[] = 'Location error. No co-ordinates for your location.'; }


// account type
$registered = filter_var($place_ad['registered'], FILTER_SANITIZE_NUMBER_INT);


// money making extras
$extras = filter_var($place_ad['extras'], FILTER_SANITIZE_NUMBER_INT); //url encoded string
$icons  = filter_var($place_ad['icons'], FILTER_SANITIZE_STRING); //url encoded string
$premium= filter_var($place_ad['premium'], FILTER_SANITIZE_NUMBER_INT); //numeric float;
$bump   = filter_var($place_ad['bump'], FILTER_SANITIZE_NUMBER_INT); //numeric float;


// user details field
if ($registered == '1') // Registering as new user
{

    $type   = filter_var($place_ad['n_type'], FILTER_SANITIZE_NUMBER_INT);
    $name   = filter_var($place_ad['n_name'], FILTER_SANITIZE_STRING);
    $phone  = filter_var($place_ad['n_phone'], FILTER_SANITIZE_STRING);
    $email  = filter_var($place_ad['n_email'], FILTER_SANITIZE_EMAIL);
    $pass   = filter_var($place_ad['n_password'], FILTER_UNSAFE_RAW);

    if(empty($type)) { $error[] = 'Type field error.'; }
    if(empty($name) || strlen($name) > 100) { $error[] = 'You did not enter your name or name too long.'; }
    if(empty($email) || strlen($email) < 5 || strlen($email) > 100) { $error[] = 'You did not enter a valid email.'; }
    if(!filter_var($email, FILTER_VALIDATE_EMAIL)) { $error[] = 'You did not enter a valid email.'; }
    if(empty($pass) || strlen($pass) < 6 || strlen($pass) > 100) { $error[] = 'Your password must be at least 6 characters.'; }

}
elseif ($registered =='2') // registered user
{
    $email  = filter_input($place_ad['n_email'], FILTER_SANITIZE_EMAIL);
    $pass   = filter_input($place_ad['n_password'], FILTER_UNSAFE_RAW);

    if(empty($email) || strlen($email) < 5 || strlen($email) > 100) { $error[] = 'You did not enter a valid email.'; }
    if(empty($pass) || strlen($pass) < 6 || strlen($pass) > 100) { $error[] = 'Your password must be at least 6 characters.'; }
}
elseif ($registered == '3') // dont wanna register details
{
    $name   = filter_input($place_ad['n_name'], FILTER_SANITIZE_STRING);
    $phone  = filter_input($place_ad['n_phone'], FILTER_SANITIZE_STRING);
    $email  = filter_input($place_ad['n_email'], FILTER_SANITIZE_EMAIL);

    if(empty($name) || strlen($name) > 100) { $error[] = 'You did not enter your name or name too long.'; }
    if(empty($email) || strlen($email) < 5 || strlen($email) > 100) { $error[] = 'You did not enter a valid email.'; }

}
/*******************************************************************
* END OF Sanitising input
********************************************************************/

I'm thinking that a lot of my code is 'unnecessary' but I think it might be bad coding practice if I were to remove it.

For example, I could ditch all of the FILTER_SANITIZE_NUMBER* filters as the database is set up correctly with INT/FLOAT fields.

I could also ditch a lot of the 'greater than >' checks as most of those are only there to prevent the user from inputting massive amounts of data (yet again this will be limited by the database field length).

Does everyone else have user input validation code this ugly?

------------------EDIT----------------------

Thanks very much for the info. As I'm using PDO I think I might try to compress it down a bit more, but can I ask the following:

  1. With input fields such as radio buttons and select boxes where the user isn't easily able to corrupt the input, would you consider that just binding PDO Constants is enough? These values are tied to enum and tinyint(1) fields in the db and manipulating these values outside of the form spec wont allow the user to achieve anything.
  2. I am also using filter_var to make the user input suitable for displaying on a UTF8 encoded page. I believe this really only encodes <> & an a couple of other characters to their entities. Would I be better just using htmlentities?
Souad
  • 4,856
  • 15
  • 80
  • 140
James
  • 656
  • 2
  • 10
  • 24
  • 1
    You are doing two things there, sanitization and value constraining. The elaborateness comes strictly from not using utility code. Many form handlers or validation libraries allow to specify rulesets for that, e.g. `array("title"=>["string"], "cat_1"=>["int", "<100"])`, reducing the overhead. And in fact databases can apply such validation as well. Prepared statements make the `filter_var` calls redundant often. Personally I'm using `$_POST->int["cat_1"]` or `$_GET->text->ascii["title"]` for such. Basically, just use a library for enhancing readability or automating such tasks. – mario Mar 21 '13 at 23:14
  • Your method is really interesting. Is that just a built-in way of casting values using OOP rather than $value = (int)$value, or a custom library you're using? – James Mar 22 '13 at 08:05

2 Answers2

2

IMO, even though it appears unwieldy, the apparent excessiveness of all the validation is necessary at the app layer IF YOU WANT to notify the user, and have them correct the entered data. Otherwise you can let the Data Access Layer (PDO for example), sanitize data by assigning PDO constants when binding values for query.

You can make the sanitize process as complex or as easy as you want, you just need to figure out which configuration suits your needs the best.

---- Update -----

A few suggestions to make the code not so overwhelming. This assumes you cannot load a vendor library like Zend or Symfony Validator Component.

class Validator
{
    public function validateString($string, array $options)
    {
        $min = ((isset($options['min'])) && (strlen($options['min'] > 1))) ? $options['min'] : null;
        $max = ((isset($options['max'])) && (strlen($options['max'] > 1))) ? $options['max'] : null;

        $string = filter_var($string, FILTER_SANITIZE_STRING);

        if (empty($string)) {
            throw new Exception(sprintf('Empty value: %s', $string));
        }

        if ((false === is_null($min) && (strlen($string) < $min)) {
            throw new Exception(sprintf('Value too short: %s', $string));
        }

        if ((false === is_null($max) && (strlen($string) > $max)) {
            throw new Exception(sprintf('Value too long: %s', $string));
        }
    }

    return $string;
}

// Calling code

try {
    $title = Validator::validateString($place_ad['title'], array('min' => 3, 'max' => 100));
} catch (Exception $e) {
    $errors[] = 'Title field empty, too long or too short.';
    // OR
    $errors[] = $e->getMessage();
}

try {
    $title = Validator::validateString($place_ad['desc'], array('min' => 3, 'max' => 5000));
} catch (Exception $e) {
    $errors[] = 'Description field empty, too long or too short.';
    // OR
    $errors[] = $e->getMessage();
}

Hopefully you get the picture. By making the methods generic to the datatypes, and flexible with an array of options, you can re-use code to reduce your footprint, but still maintain the level of data sanitization you wish to achieve.

Mike Purcell
  • 19,847
  • 10
  • 52
  • 89
-2

Assuming the information will be displayed on a website (as part of HTML), there are two things you need to do:

  1. Sanitize for database queries (Or use prepared statements)
  2. Sanitize for HTML display.

The first is easy, if you use mysqli or PDO regularly, you should already know how to do so. Here's a question on the subject: PHP PDO prepared statements.

The second is trickier in your case. You need to go through each variable, and pass it through a cleaning function. Be it htmlspecialchars or a more aggressive HTML filter such as HTML Purifier. Here's a question on the subject: How to strip specific tags and specific attributes from a string?.

If you have the values in an array, you can use array_walk or array_map to sanitize the entire array in a few lines.

Community
  • 1
  • 1
Madara's Ghost
  • 172,118
  • 50
  • 264
  • 308
  • 2
    This doesn't really answer the question. The OP isn't just sanitizing DB inserts or HTML, he's trying to actually do quality-control on inputs: make sure certain inputs are numbers, certain ones are strings, certain ones have non-empty values, certain ones are formatted as emails, etc.... – Ben D Mar 21 '13 at 22:54