9

I see the oft-repeated comment "always use prepared queries to protect against SQL injection attacks".

What is the practical difference between using a prepared query and a constructed query where user input is always sanitized?

Constructed

function quote($value) {
  global $db; 
  return "'" . mysqli_real_escape_string($db, $value) . "'";
}

$sql = "INSERT INTO foo (a, b) VALUES (" . quote($a) . "," . quote($b) . ")";

Prepared

$stmt = mysqli_prepare($db, "INSERT INTO foo (a, b) VALUES (?, ?)");
mysqli_stmt_bind_param($stmt, "ss", $a, $b);

Aside from verbosity and style, what reasons would I want to use one over the other?

Andy Jones
  • 6,205
  • 4
  • 31
  • 47

3 Answers3

7

An excellent question.

Escaping

Escaping is the more traditional way to do things. It adds backslashes to any unsafe data so if someone tries to pass a direct SQL statement in, it will pass harmlessly instead of being taken literally. The catch here is that there are some edge cases where an attacker could still get around the escaping function. The vast majority of these involve arcane tricks like changing character sets and such. The main thing to understand, if you pursue escaping, is that it is not sure fire bet for security in all cases.

Prepared Statements

Prepared statements are not subject to the same flaws because you're breaking your query up into two parts. The first part contains the query and the second contains the parameters for that query. As a result, MySQL can treat the parameters differently, in a safe way that will not allow injection (at least nobody has yet found a way).

If you need to run the same query over and over with different values, prepared statements save you a lot of time. MySQL stores the statement in memory so multiple execution is very fast.

You can't parameterize some parts of your query however (i.e. table names).

Caveat Emptor

So prepared statements are 100% secure, right? Why not just use prepared statements for everything? We've solved SQL injection FOREVER! Well, not quite. There's some caveats you need to know about first.

The first is specific to mysqli. You will want to have the MySQL Native Driver (mysqlnd) installed to do prepared statements that return results in mysqli. Specifically, mysqli_stmt_get_result, which returns a result set just like mysqli_query. Some shared hosts and older servers are still using the older MySQL Client Libraries. If you want to know which you're using, run phpinfo() and look for the mysqlnd block. mysqlnd is a good idea even if you're not using prepared statements because it's built by the PHP team specifically to make the most out of MySQL. Some people cannot change the driver, however, as this is a server configuration level change.

The second is for those who use PDO (another way to connect to your database in PHP). Now, PDO has a much better way of handling prepared statements (it has a handy aliasing system) and does NOT require mysqlnd to do prepared statements. The caveat here is that PDO, by default, only emulates prepared statements, meaning that if you don't configure it properly, all you're doing is using a library that will do the escaping for you. See this page under PDO::ATTR_EMULATE_PREPARES for more info.

The third one is by far the most important and you need to pay close attention here because this could have an impact on your program if you're not careful. Most people who advocate prepared statements tend to neglect the largest pitfall here, which is you're doing more queries on your database (here's the MySQL 5.5 manual on how to do them via the CLI). Now, for INSERT and UPDATE statements where you're going to be running the same query over and over there's no comparison. Prepared statements save you a LOT of time over their concatenated counterparts. But for SELECT, it becomes a lot less clear. Take my example above. It's a simple pull of a record via the autonumber field. But it takes 2 queries to run now, instead of one. If this is an internal application with low traffic that might not be a big deal but if that page gets 100,000 views a day, you might be looking to lower the overhead on that page some. If so, building a straight up query could be a solution for you.

So, in short

mysqli_real_escape_string has some pitfalls but can be made "safe enough" for single queries as long as you understand those pitfalls and don't treat it as the only security between you and SQL injection.

mysqli_prepare is ALWAYS secure, but also does more queries so it may not be as efficient for your database and/or program. Some queries should ALWAYS be prepared, while some probably shouldn't be.

Other solutions can also useful instead of using either solution

  • Typecasting (has some limitations but nothing that with make for SQL unsafe)
  • Whitelisting (where you ensure, via PHP, what values will be acceptable and only allow those values)

In the end, it's up to you to figure out where each is appropriate and where it might be better to use the other.

HamZa
  • 14,671
  • 11
  • 54
  • 75
Machavity
  • 30,841
  • 27
  • 92
  • 100
2

Prepared queries are sent to the SQL server separate from the parameters, meaning that they only need to be compiled/optimized once if executed multiple times (e.g. with varying parameters). This can be quite significant with large datasets.

Other than that, they are functionally identical provided the input is actually properly escaped in all cases when using non-prepared queries.

That said, prepared queries are a lot less prone to oversights, usually resulting in better maintainability.

Edit: Also check out @eggyal's comment as to why prepared statements are always injection-safe as opposed to escaped query parameters.

jwueller
  • 30,582
  • 4
  • 66
  • 70
  • 4
    Agreed, but note that because the parameters of prepared statements are *never* parsed for SQL by the server, it is *impossible* for SQL to be injected through them; whereas escaping methods [have been](http://vigilance.fr/vulnerability/MySQL-SQL-injection-via-multi-byte-characters-5885) (and [remain subject to](http://stackoverflow.com/a/12118602)) exploitable vulnerabilities that can, in some obscure edge cases, lead to successful injection attacks. – eggyal Dec 09 '13 at 00:20
  • @eggyal: Exactly. This is why they are only functionally identical if - in fact - properly escaped, as stated. Doing it properly is usually not an easy task, though. – jwueller Dec 09 '13 at 00:21
  • And here I thought that PDO was going to save us all @eggyal as per reading http://stackoverflow.com/a/12118602 (Under **The Bad**) – Funk Forty Niner Dec 09 '13 at 00:52
  • 3
    Downvoter, please explain yourself so I can improve. – jwueller Dec 09 '13 at 06:47
  • I don't understand this kind of demands. If you *can* improve - why don't you just improve? – Your Common Sense Dec 09 '13 at 06:53
  • 2
    If you spotted a mistake that I might have missed, it would obviously be nice to know, as it would likely help me to prevent the same mistake in the future. – jwueller Dec 09 '13 at 07:02
-2

Good question. Many people indeed overlook the real difference.

There are two serious flaws with "constructing".

First, "constructing" you are talking about is manual. Means it's open to two threats:

  • it makes formatting detachable. A smart programmer would be always tempted to "optimize" the code... resulting in moving repeated quote() calls somewhere else, earlier in the code, to automate the process. Which will lead straight to disaster, as in a big project there is a very big chance to overlook a variable or two, and leave them unformatted.
  • manual formatting always depends on the programmer's skill and mood. Some will do it all right and some will fail. You cannot guarantee proper formatting in case it's manual.

Yet placeholder guarantees that. As long as every variable goes into query via placeholder, you can consider yourself safe. This is why it said: "always use prepared queries to guard against SQL injection attacks", no matter if it's native prepared statements or just emulated. The very idea of using a placeholder that matters.

Second, SQL syntax is not limited to strings only. Say, your quote() function won't work with LIMIT clause parameters. So you'll have to devise something else (or simply let injection in).

Means you have to have a placeholder for the every kind of SQL literal. Here is my attempt to accomplish the goal - a library that lets you to have placeholders for all the usual data types.

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345