61

This is to create a community learning resource. The goal is to have examples of good code that do not repeat the awful mistakes that can so often be found in copy/pasted PHP code. I have requested it be made Community Wiki.

This is not meant as a coding contest. It's not about finding the fastest or most compact way to do a query - it's to provide a good, readable reference especially for newbies.

Every day, there is a huge influx of questions with really bad code snippets using the mysql_* family of functions on Stack Overflow. While it is usually best to direct those people towards PDO, it sometimes is neither possible (e.g. inherited legacy software) nor a realistic expectation (users are already using it in their project).

Common problems with code using the mysql_* library include:

  • SQL injection in values
  • SQL injection in LIMIT clauses and dynamic table names
  • No error reporting ("Why does this query not work?")
  • Broken error reporting (that is, errors always occur even when the code is put into production)
  • Cross-site scripting (XSS) injection in value output

Let's write a PHP code sample that does the following using the mySQL_* family of functions:

  • Accept two POST values, id (numeric) and name (a string)
  • Do an UPDATE query on a table tablename, changing the name column in the row with the ID id
  • On failure, exit graciously, but show the detailed error only in production mode. trigger_error() will suffice; alternatively use a method of your choosing
  • Output the message "$name updated."

And does not show any of the weaknesses listed above.

It should be as simple as possible. It ideally doesn't contain any functions or classes. The goal is not to create a copy/pasteable library, but to show the minimum of what needs to be done to make database querying safe.

Bonus points for good comments.

The goal is to make this question a resource that a user can link to when encountering a question asker who has bad code (even though it isn't the focus of the question at all) or is confronted with a failing query and doesn't know how to fix it.

To pre-empt PDO discussion:

Yes, it will often be preferable to direct the individuals writing those questions to PDO. When it is an option, we should do so. It is, however, not always possible - sometimes, the question asker is working on legacy code, or has already come a long way with this library, and is unlikely to change it now. Also, the mysql_* family of functions is perfectly safe if used properly. So no "use PDO" answers here please.

Community
  • 1
  • 1
Pekka
  • 442,112
  • 142
  • 972
  • 1,088

5 Answers5

12

My stab at it. Tried to keep it as simple as possible, while still maintaining some real-world conveniences.

Handles unicode and uses loose comparison for readability. Be nice ;-)

<?php

header('Content-type: text/html; charset=utf-8');
error_reporting(E_ALL | E_STRICT);
ini_set('display_errors', 1);
// display_errors can be changed to 0 in production mode to
// suppress PHP's error messages

/*
Can be used for testing
$_POST['id'] = 1;
$_POST['name'] = 'Markus';
*/

$config = array(
    'host' => '127.0.0.1', 
    'user' => 'my_user', 
    'pass' => 'my_pass', 
    'db' => 'my_database'
);

# Connect and disable mysql error output
$connection = @mysql_connect($config['host'], 
    $config['user'], $config['pass']);

if (!$connection) {
    trigger_error('Unable to connect to database: ' 
        . mysql_error(), E_USER_ERROR);
}

if (!mysql_select_db($config['db'])) {
    trigger_error('Unable to select db: ' . mysql_error(), 
        E_USER_ERROR);
}

if (!mysql_set_charset('utf8')) {
    trigger_error('Unable to set charset for db connection: ' 
        . mysql_error(), E_USER_ERROR);
}

$result = mysql_query(
    'UPDATE tablename SET name = "' 
    . mysql_real_escape_string($_POST['name']) 
    . '" WHERE id = "' 
    . mysql_real_escape_string($_POST['id']) . '"'
);

if ($result) {
    echo htmlentities($_POST['name'], ENT_COMPAT, 'utf-8') 
        . ' updated.';
} else {
    trigger_error('Unable to update db: ' 
        . mysql_error(), E_USER_ERROR);
}
Markus Hedlund
  • 23,374
  • 22
  • 80
  • 109
  • 1
    @Pekka Aw crap, fixed it. Thanks! – Markus Hedlund Jun 01 '11 at 09:14
  • 5
    -1 for `#error_reporting(~E_ALL); ... to disable error output` - bad advice. Can't downvote more, else I would -1 for `@` operator. – OZ_ Jun 01 '11 at 09:14
  • @OZ_ I can see your point about @, but why is turning off error reporting in production mode bad? – Pekka Jun 01 '11 at 09:17
  • @OZ_: Why is this an issue? In production mode there is no harm in disabling error output, as long as you log it somewhere else. And using @ is the only way to hide nasty output from `mysql_connect`. Please reconsider. – Markus Hedlund Jun 01 '11 at 09:18
  • @Pekka: What's wrong with @? :O – Markus Hedlund Jun 01 '11 at 09:18
  • @Znarkus `@` always suppresses error output, which is viewed as bad. What I don't understand is the criticism of `error_reporting(~E_ALL);` – Pekka Jun 01 '11 at 09:19
  • @Pekka I just hide the output, but still handle any errors, and later show my own error output if enabled. – Markus Hedlund Jun 01 '11 at 09:25
  • 3
    @Pekka, @Znarkus, `error_reporting(0);` it's very bad advice, because not always errors will be handled by the error handler, because not always this handler will be even coded. `ini_set('display_errors',0)` - this variant can be used in production code, to hide text of errors, and only when errors are handled. – OZ_ Jun 01 '11 at 09:26
  • @OZ_ you're right of couse, duh. Good point. – Pekka Jun 01 '11 at 09:26
  • @OZ_ Changed to `display_errors`. – Markus Hedlund Jun 01 '11 at 09:56
  • Hmm, shouldn't an `id` which I assume is an integer NOT be quoted in the query (you are making MySQL work harder casting it), but just used with `intval`? – Wrikken Jan 15 '12 at 01:00
  • @Wrikken Even though `id` mostly are integers, they don't need to be, and many times are not. – Markus Hedlund Jan 15 '12 at 16:26
  • @Znarkus: I know, however, this is meant to be a reference for best practices. It would not hurt to show that on of those is NOT to quote numerical values, but instead cast them to their desired to type. – Wrikken Jan 15 '12 at 16:30
7

I decided to jump the gun and just put something up. It's something to start with. Throws an exception on error.

function executeQuery($query, $args) {
    $cleaned = array_map('mysql_real_escape_string', $args);

    if($result = mysql_query(vsprintf($query, $cleaned))) {
        return $result;
    } else {
        throw new Exception('MySQL Query Error: ' . mysql_error());
    }
}

function updateTablenameName($id, $name) {
    $query = "UPDATE tablename SET name = '%s' WHERE id = %d";

    return executeQuery($query, array($name, $id));
}

try {
    updateTablenameName($_POST['id'], $_POST['name']);
} catch(Exception $e) {
    echo $e->getMessage();
    exit();
}
Aaron
  • 829
  • 2
  • 9
  • 15
  • 3
    Althought it works, it's waaaay too complex for newbies and copy-pasting-people. – Carlos Campderrós Jun 01 '11 at 08:40
  • 1
    @Aaron - have you read the question and comments? read them properly, just Copy-Pasting is not the solution to this question. – Sujit Agarwal Jun 01 '11 at 08:41
  • 1
    Regarding: _On failure, exits graciously, but shows the detailed error **only** in production mode._ Maybe add some conditional arround the `echo $e->getMessage();`? – Yoshi Jun 01 '11 at 08:45
  • @Coding-Freak -- not sure if you're implying that I copied and pasted this from somewhere else. If you are, I didn't. – Aaron Jun 01 '11 at 08:46
  • @Carlos - the exception might be a tad much, and the executeQuery abstraction could be unnecessary if it's just an example. Aside from that, I think it serves the purpose without being overly complex or verbose. – Aaron Jun 01 '11 at 08:48
  • 1
    @Aaron - I dint mention it for your answer, rather I wanted it for others to follow. I can understand its ur own answer. – Sujit Agarwal Jun 01 '11 at 08:49
  • I hope I didn't violate some unspoken rule or SO etiquette by posting this prematurely. If I did, I'll remove. – Aaron Jun 01 '11 at 08:53
  • @Aaron I think the `executeQuery` is far from unnecessary even if this is _just_ an example, as what the function does is something which should allways be done, and as such has it's purpose. – Yoshi Jun 01 '11 at 08:53
  • @Aaron - You have not broken any rule of SO, but it is a kind of silent decision amongst the SO peers to make this question and its answer a valuable source of information, and to do that, all were deciding by means of comments and getting accustomed to the problems that need to be resolved. – Sujit Agarwal Jun 01 '11 at 08:58
  • 2
    Err, wouldn't this assign the $id value to name and $name to id? Apart from the fact that you're calling updateTableName when your function is called updateTablenameName :p. – wimvds Jun 01 '11 at 11:02
  • @wimvds, thanks. Both issues have been fixed. – Aaron Jun 01 '11 at 21:29
3
/**
 * Rule #0: never trust users input!
 */

//sanitize integer value
$id = intval($_GET['id']);
//sanitize string value;
$name = mysql_real_escape_string($_POST['name']);
//1. using `dbname`. is better than using mysql_select_db()
//2. names of tables and columns should be quoted by "`" symbol
//3. each variable should be sanitized (even in LIMIT clause)
$q = mysql_query("UPDATE `dbname`.`tablename` SET `name`='".$name."' WHERE `id`='".$id."' LIMIT 0,1 ");
if ($q===false)
{
    trigger_error('Error in query: '.mysql_error(), E_USER_WARNING);
}
else
{
    //be careful! $name contains user's data, remember Rule #0
    //always use htmlspecialchars() to sanitize user's data in output
    print htmlspecialchars($name).' updated';
}

########################################################################
//Example, how easily is to use set_error_handler() and trigger_error()
//to control error reporting in production and dev-code
//Do NOT use error_reporting(0) or error_reporting(~E_ALL) - each error
//should be fixed, not muted
function err_handler($errno, $errstr, $errfile, $errline)
{
    $hanle_errors_print = E_ALL & ~E_NOTICE;

    //if we want to print this type of errors (other types we can just write in log-file)
    if ($errno & $hanle_errors_print)
    {
        //$errstr can contain user's data, so... Rule #0
        print PHP_EOL.'Error ['.$errno.'] in file '.$errfile.' in line '.$errline
              .': '.htmlspecialchars($errstr).PHP_EOL;
    }
    //here you can write error into log-file
}

set_error_handler('err_handler', E_ALL & ~E_NOTICE & E_USER_NOTICE & ~E_STRICT & ~E_DEPRECATED);

And some explanation of comments:

//1. using `dbname`. is better than using mysql_select_db()

With using mysql_select_db you can create errors, and it will be not so easy to find and fix them.
For example, in some script you will set db1 as database, but in some function you need to set db2 as database.
After calling this function, database will be switched, and all following queries in script will be broken or will broke some data in wrong database (if names of tables and columns will coincide).

//2. names of tables and columns should be quoted by "`" symbol 

Some names of columns can be also SQL-keywords, and using "`" symbol will help with that.
Also, all string-values, inserted to query, should be quoted by ' symbol.

//always use htmlspecialchars() to sanitize user's data in output
It will help you to prevent XSS-attacks.

OZ_
  • 12,492
  • 7
  • 50
  • 68
  • Nice, I like it! I also like the error handler, but maybe it's too much for a newbie to grasp here - would it be an option to remove it for now, or move it to the bottom as "optional"? (I edited out the `i` in `mysqli`) – Pekka Jun 01 '11 at 08:57
  • feel free to edit mistakes in my broken English :) – OZ_ Jun 01 '11 at 08:57
  • @Pekka I will try to take more focus on mysql_, will edit now. – OZ_ Jun 01 '11 at 08:59
  • I would replace the string concatenation with `sprintf`. This makes stuff so much more readable, and possible syntax errors (missing quotes and stuff) more visible. – Yoshi Jun 01 '11 at 09:04
  • @Yoshi I think it's personal preferences and it's definitely not more simple. – OZ_ Jun 01 '11 at 09:07
2
<?  
mysql_connect(); 
mysql_select_db("new"); 
$table = "test"; 
if($_SERVER['REQUEST_METHOD']=='POST') {
  $name = mysql_real_escape_string($_POST['name']); 
  if ($id = intval($_POST['id'])) { 
    $query="UPDATE $table SET name='$name' WHERE id=$id"; 
  } else { 
    $query="INSERT INTO $table SET name='$name'"; 
  } 
  mysql_query($query) or trigger_error(mysql_error()." in ".$query); 
  header("Location: http://".$_SERVER['HTTP_HOST'].$_SERVER['PHP_SELF']);  
  exit;  
}  
if (!isset($_GET['id'])) {
  $LIST=array(); 
  $query="SELECT * FROM $table";  
  $res=mysql_query($query); 
  while($row=mysql_fetch_assoc($res)) $LIST[]=$row; 
  include 'list.php'; 
} else {
  if ($id=intval($_GET['id'])) { 
    $query="SELECT * FROM $table WHERE id=$id";  
    $res=mysql_query($query); 
    $row=mysql_fetch_assoc($res); 
    foreach ($row as $k => $v) $row[$k]=htmlspecialchars($v); 
  } else { 
    $row['name']=''; 
    $row['id']=0; 
  } 
  include 'form.php'; 
}  
?>

form.php

<? include 'tpl_top.php' ?>
<form method="POST">
<input type="text" name="name" value="<?=$row['name']?>"><br>
<input type="hidden" name="id" value="<?=$row['id']?>">
<input type="submit"><br>
<a href="?">Return to the list</a>
</form>
<? include 'tpl_bottom.php' ?>

list.php

<? include 'tpl_top.php' ?>
<a href="?id=0">Add item</a>
<? foreach ($LIST as $row): ?>
<li><a href="?id=<?=$row['id']?>"><?=$row['name']?></a>
<? endforeach ?>
<? include 'tpl_bottom.php' ?>
Your Common Sense
  • 156,878
  • 40
  • 214
  • 345
0

Looks like my other answer missed the aim of the question.
(this one doesn't meet some requirements either, but as it can be seen, no safe solution can be achieved without implementing a function to process placeholders, which are being the cornerstone of the safe queries)

So, here is another attempt to post concise solution to make mysql queries safe yet handy.

A function I wrote long time ago and it served me well until I moved to the corporative standard OOP-based solution.
There was 2 goals to pursue for: security and ease of use.

First one achieved by implementing placeholders.
Second one achieved by implementing placeholders and different result types.

The function surely not ideal one. Some drawbacks are:

  • no % chars have to be placed in the query directly as it's using printf syntax.
  • no multiple connections supported.
  • no placeholder for the identifiers (as well as many other handy placeholders).
  • again, no identifier placeholder!. "ORDER BY $field" case have to be handled manually!
  • of course an OOP implementation would be much more flexible, having neat distinct methods instead ugly "mode" variable as well other necessary methods.

Yet it is good, safe and concise, no need to install a whole library.

function dbget() {
  /*
  usage: dbget($mode, $query, $param1, $param2,...);
  $mode - "dimension" of result:
  0 - resource
  1 - scalar
  2 - row
  3 - array of rows
  */
  $args = func_get_args();
  if (count($args) < 2) {
    trigger_error("dbget: too few arguments");
    return false;
  }
  $mode  = array_shift($args);
  $query = array_shift($args);
  $query = str_replace("%s","'%s'",$query); 

  foreach ($args as $key => $val) {
    $args[$key] = mysql_real_escape_string($val);
  }

  $query = vsprintf($query, $args);
  if (!$query) return false;

  $res = mysql_query($query);
  if (!$res) {
    trigger_error("dbget: ".mysql_error()." in ".$query);
    return false;
  }

  if ($mode === 0) return $res;

  if ($mode === 1) {
    if ($row = mysql_fetch_row($res)) return $row[0];
    else return NULL;
  }

  $a = array();
  if ($mode === 2) {
    if ($row = mysql_fetch_assoc($res)) return $row;
  }
  if ($mode === 3) {
    while($row = mysql_fetch_assoc($res)) $a[]=$row;
  }
  return $a;
}
?>

usage examples

$name = dbget(1,"SELECT name FROM users WHERE id=%d",$_GET['id']);
$news = dbget(3,"SELECT * FROM news WHERE title LIKE %s LIMIT %d,%d",
              "%$_GET[search]%",$start,$per_page);

As it can be seen from the above examples, the main difference from all the codes ever posted in Stackoverflow, both safety and data retrieval routines are encapsulated in the function code. So, no manual binding, escaping/quoting or casting, as well as no manual data retrieval.

combined with other helper function

function dbSet($fields,$source=array()) {
  $set = '';
  if (!$source) $source = &$_POST;
  foreach ($fields as $field) {
    if (isset($source[$field])) {
      $set.="`$field`='".mysql_real_escape_string($source[$field])."', ";
    }
  }
  return substr($set, 0, -2); 
}

used like this

$fields = explode(" ","name surname lastname address zip phone regdate");
$_POST['regdate'] = $_POST['y']."-".$_POST['m']."-".$_POST['d'];
$sql = "UPDATE $table SET ".dbSet($fields).", stamp=NOW() WHERE id=%d";
$res = dbget(0,$sql, $_POST['id']);
if (!$res) {
  _503;//calling generic 503 error function
}

it may cover almost every need, including the example case from the OP.

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345