136

This is what I've read so far about PDO::ATTR_EMULATE_PREPARES:

  1. PDO's prepare emulation is better for performance since MySQL's native prepare bypasses the query cache.
  2. MySQL's native prepare is better for security (preventing SQL Injection).
  3. MySQL's native prepare is better for error reporting.

I don't know how true any of these statements are anymore. My greatest concern in choosing a MySQL interface is preventing SQL Injection. The second concern is performance.

My application currently uses procedural MySQLi (without prepared statements), and utilizes the query cache quite a bit. It will rarely re-use prepared statements in a single request. I started the move to PDO for the named parameters and security of prepared statements.

I'm using MySQL 5.1.61 and PHP 5.3.2

Should I leave PDO::ATTR_EMULATE_PREPARES enabled or not? Is there a way to have both the performance of the query cache and the security of prepared statements?

Community
  • 1
  • 1
Andrew Ensley
  • 11,611
  • 16
  • 61
  • 73
  • 3
    Honestly? Just keep using MySQLi. If it's already working using prepared statements under that, PDO is basically a pointless layer of abstraction. *EDIT*: PDO is really useful for green field applications where you're not sure what database is going into the back-end. – jmkeyes Apr 12 '12 at 01:59
  • 1
    Sorry, my question was unclear before. I've edited it. The application does not use prepared statements in MySQLi at the moment; just mysqli_run_query(). From what I've read, MySQLi prepared statements also bypass the query cache. – Andrew Ensley Apr 12 '12 at 14:13

7 Answers7

121

To answer your concerns:

  1. MySQL >= 5.1.17 (or >= 5.1.21 for the PREPARE and EXECUTE statements) can use prepared statements in the query cache. So your version of MySQL+PHP can use prepared statements with the query cache. However, make careful note of the caveats for caching query results in the MySQL documentation. There are many kinds of queries which cannot be cached or which are useless even though they are cached. In my experience the query cache isn't often a very big win anyway. Queries and schemas need special construction to make maximum use of the cache. Often application-level caching ends up being necessary anyway in the long run.

  2. Native prepares doesn't make any difference for security. The pseudo-prepared statements will still escape query parameter values, it will just be done in the PDO library with strings instead of on the MySQL server using the binary protocol. In other words, the same PDO code will be equally vulnerable (or not-vulnerable) to injection attacks regardless of your EMULATE_PREPARES setting. The only difference is where the parameter replacement occurs--with EMULATE_PREPARES, it occurs in the PDO library; without EMULATE_PREPARES, it occurs on the MySQL server.

  3. Without EMULATE_PREPARES you may get syntax errors at prepare-time rather than at execute-time; with EMULATE_PREPARES you will only get syntax errors at execution time because PDO doesn't have a query to give to MySQL until execution time. Note that this affects the code you will write! Especially if you are using PDO::ERRMODE_EXCEPTION!

An additional consideration:

  • There is a fixed cost for a prepare() (using native prepared statements), so a prepare();execute() with native prepared statements may be a little slower than issuing a plain textual query using emulated prepared statements. On many database systems the query plan for a prepare() is cached as well and may be shared with multiple connections, but I don't think MySQL does this. So if you do not reuse your prepared statement object for multiple queries your overall execution may be slower.

As a final recommendation, I think with older versions of MySQL+PHP, you should emulate prepared statements, but with your very recent versions you should turn emulation off.

After writing a few apps that use PDO, I've made a PDO connection function which has what I think are the best settings. You should probably use something like this or tweak to your preferred settings:

/**
 * Return PDO handle for a MySQL connection using supplied settings
 *
 * Tries to do the right thing with different php and mysql versions.
 *
 * @param array $settings with keys: host, port, unix_socket, dbname, charset, user, pass. Some may be omitted or NULL.
 * @return PDO
 * @author Francis Avila
 */
function connect_PDO($settings)
{
    $emulate_prepares_below_version = '5.1.17';

    $dsndefaults = array_fill_keys(array('host', 'port', 'unix_socket', 'dbname', 'charset'), null);
    $dsnarr = array_intersect_key($settings, $dsndefaults);
    $dsnarr += $dsndefaults;

    // connection options I like
    $options = array(
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC
    );

    // connection charset handling for old php versions
    if ($dsnarr['charset'] and version_compare(PHP_VERSION, '5.3.6', '<')) {
        $options[PDO::MYSQL_ATTR_INIT_COMMAND] = 'SET NAMES '.$dsnarr['charset'];
    }
    $dsnpairs = array();
    foreach ($dsnarr as $k => $v) {
        if ($v===null) continue;
        $dsnpairs[] = "{$k}={$v}";
    }

    $dsn = 'mysql:'.implode(';', $dsnpairs);
    $dbh = new PDO($dsn, $settings['user'], $settings['pass'], $options);

    // Set prepared statement emulation depending on server version
    $serverversion = $dbh->getAttribute(PDO::ATTR_SERVER_VERSION);
    $emulate_prepares = (version_compare($serverversion, $emulate_prepares_below_version, '<'));
    $dbh->setAttribute(PDO::ATTR_EMULATE_PREPARES, $emulate_prepares);

    return $dbh;
}
rink.attendant.6
  • 44,500
  • 61
  • 101
  • 156
Francis Avila
  • 31,233
  • 6
  • 58
  • 96
  • 29
    Re #2: surely values that MySQL receives as parameters (to native prepared statements) do not get parsed for SQL **at all**? So the risk of injection *must* be lower than using PDO's prepare emulation, where any flaw in escaping (e.g. the historical issues `mysql_real_escape_string` had with multi-byte characters) would still leave one open to injection attacks? – eggyal May 04 '12 at 20:56
  • 1
    #1: Awesome! This is completely new information to me, and I'm very happy about it. The app does utilize the query cache quite a bit (over 50% of all queries are retrieved straight from the cache). #2: I'd have to agree with eggyal. That's the whole point of moving to prepared statements is for the security of MySQL escaping the content itself rather than PHP guessing how the content should be escaped. #3: I actually prefer knowing there's an issue before the query is executed, so that's fine. Great answer. Thanks for your help! – Andrew Ensley May 04 '12 at 21:35
  • 3
    @eggyal, you are making assumptions about how prepared statements are implemented. PDO may have a bug in its emulated prepares escaping, but MySQL might have bugs too. AFAIK, no problems have been discovered with emulated prepares which could cause parameter literals to pass through unescaped. – Francis Avila May 04 '12 at 21:39
  • 2
    Awesome answer, but I have a question: If you turn EMULATION off, won't the execution be slower ? PHP would have to send the prepared statement to MySQL for validation and only then send the parameters. So, if you use the prepared statement 5 times, PHP will talk to MySQL 6 times (instead of 5). Won't this make it slower ? Besides, I think there's a greater chance that PDO could have bugs in the validation process, rather than MySQL... – Radu Murzea Apr 08 '13 at 12:14
  • The only way to know is to test but in my experience it's usually faster because 1) The total amount of data sent back-and-forth is smaller (5 whole queries vs 1 query and 5 sets of data) and 2) the query plan is calculated only once instead of five times. As for PDO vs MySQL code quality, on what basis do you prefer one over the other? Neither of them are exactly known as stellar, bulletproof systems! – Francis Avila Apr 08 '13 at 14:14
  • 7
    Note the points made in [this answer](http://stackoverflow.com/a/12118602) re prepared statement emulation using `mysql_real_escape_string` under the hood and the consequent vulnerabilities that can arise (in very particular edge cases). – eggyal Jun 12 '13 at 13:22
  • @RaduMurzea The difference is normally negligible. When performing repeated queries, like multiple updates, or inserts, turning off emulation is faster but only by a small margin. The other case would be for external RDBMS's where there would be a high latency or large amounts of traffic (1000+ connections). At that point though you lean more toward clustering and stored procedures+triggers rather than multiple application based calls to the server. – Will B. Sep 25 '13 at 05:15
  • 6
    +1 Good answer! But for the record, if you use native prepare, parameters are never escaped or combined into the SQL query even on the MySQL server side. By the time you execute and supply parameters, the query has been parsed and transformed into internal data structures in MySQL. Read this blog by a MySQL optimizer engineer that explains this process: http://guilhembichot.blogspot.com/2014/05/re-factoring-some-internals-of-prepared.html I'm not saying this means native prepare is better, insofar as we trust the PDO code to do escaping correctly (which I do). – Bill Karwin Jul 09 '14 at 23:06
  • 2
    This kind of posts is actually the best answer to the question "Why SO is that valuable learning resource for all developers/programmers". Classic definition how good answers should look alike. Simply brilliant. Thanks to Francis. – Wh1T3h4Ck5 Dec 30 '14 at 22:52
  • 2
    So despite acknowledging that emulating prepares can radically change the application behaviour, you provide a "function" that arbitrarily switches mode depending on MySQL version. So anyone foolish enough to use your "function", then upgrading MySQL, suddenly breaks their application in many mysterious and hard-to-trace ways. Brilliant. – Quolonel Questions Apr 05 '19 at 09:15
  • 4
    "Native prepares doesn't make any difference for security. The pseudo-prepared statements will still escape query parameter values, it will just be done in the PDO library with strings instead of on the MySQL server using the binary protocol." This is either a complete and dangerous misrepresentation or MySQL is very poorly implemented. That is not how parameterized queries are supposed to be implemented; the parameters are supposed to bypass the parsing phase entirely, preventing them from ever being interpreted as part of SQL code. So which is it: dangerously wrong or incorrect? – jpmc26 Sep 20 '19 at 03:59
  • Someone should check if ATTR_EMULATE_PREPARES when off is able to avoid round trip times perhaps by buffering up the query, binds, execute to send all at once but I don't think it does. See if prepare with bad SQL syntax fails on prepare or execute. – jgmjgm Feb 24 '20 at 17:53
26

I'm surprised no one has mentioned one of the biggest reasons to turn off emulation. With emulation on, PDO returns all integers and floats as strings. When you turn off emulation, integers and floats in MySQL become integers and floats in PHP.

For more information, see the accepted answer for this question: PHP + PDO + MySQL: how do I return integer and numeric columns from MySQL as integers and numerics in PHP?.

dallin
  • 8,775
  • 2
  • 36
  • 41
  • 9
    This is true in PHP versions < `8.1`. However, beginning in 8.1, emulated prepares will be compatible with native prepares with regards to properly returning integers and floats. See the [PHP 8.1 Upgrade Guide](https://github.com/php/php-src/blob/7b34db0659dda933b1146a0ff249f25acca1d669/UPGRADING#L130) for more information. – Tim Martin Aug 29 '21 at 21:38
11

Beware on disabling PDO::ATTR_EMULATE_PREPARES (turning native prepares on) when your PHP pdo_mysql is not compiled against mysqlnd.

Because old libmysql is not fully compatible with some functions, it can lead to strange bugs, for example:

  1. Losing most significant bits for 64bit integers when binding as PDO::PARAM_INT ( 0x12345678AB will be cropped to 0x345678AB on 64bit machine )
  2. Inability to make simple queries like LOCK TABLES ( it throws SQLSTATE[HY000]: General error: 2030 This command is not supported in the prepared statement protocol yet exception )
  3. Need to fetch all rows from result or close cursor before next query ( with mysqlnd or emulated prepares it automatically does this work for you and doesn't go out of sync with mysql server )

These bugs I figured out in my simple project when migrated to other server which used libmysql for pdo_mysql module. Maybe there are much more bugs, I don't know. Also I tested on fresh 64bit debian jessie, all listed bugs occur when I apt-get install php5-mysql, and disappear when I apt-get install php5-mysqlnd.

When PDO::ATTR_EMULATE_PREPARES is set to true (as default) - these bugs don't happen anyway, because PDO doesn't use prepared statements at all in this mode. So, if you use pdo_mysql based on libmysql ("mysqlnd" substring does't appear in "Client API version" field of pdo_mysql section in phpinfo) - you should not turn PDO::ATTR_EMULATE_PREPARES off.

Sage Pointer
  • 536
  • 6
  • 13
9

I would turn off emulate prepares as you're running 5.1 which means PDO will take advantage of the native prepared statement functionality.

PDO_MYSQL will take advantage of native prepared statement support present in MySQL 4.1 and higher. If you're using an older version of the mysql client libraries, PDO will emulate them for you.

http://php.net/manual/en/ref.pdo-mysql.php

I ditched MySQLi for PDO for the prepared named statements and the better API.

However, to be balanced, PDO performs negligibly slower than MySQLi, but it's something to bear in mind. I knew this when I made the choice, and decided that a better API and using the industry standard was more important than using a negligibly faster library that ties you to a particular engine. FWIW I think the PHP team is also looking favourably at PDO over MySQLi for the future too.

Will Morgan
  • 4,470
  • 5
  • 29
  • 42
  • Thank you for that information. How has not being able to use the query cache impacted your performance or were you even using it before? – Andrew Ensley May 04 '12 at 14:10
  • I can't say as the framework I'm using caches on multiple levels anyway. You can always explicitly use SELECT SQL_CACHE though. – Will Morgan May 04 '12 at 16:37
  • Did not even know there was a SELECT SQL_CACHE option. However, it appears that that still wouldn't work. From the docs: "The query result is cached **if it is cacheable**..." http://dev.mysql.com/doc/refman/5.1/en/query-cache-in-select.html – Andrew Ensley May 04 '12 at 17:25
  • Yes. That depends on the nature of the query, rather than the platform specifics. – Will Morgan May 07 '12 at 22:31
  • I read that to mean "The query result is cached *unless something else prevents it from being cacheable*," which - from what I had read until then - included prepared statements. However, thanks to Francis Avila's answer, I know that that is no longer true for my version of MySQL. – Andrew Ensley May 14 '12 at 19:21
8

I'd recommend enabling real database PREPARE calls as the emulation doesn't catch everything.., for example, it will prepare INSERT;!

var_dump($dbh->prepare('INSERT;'));
$dbh->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
var_dump($dbh->prepare('INSERT;'));

The output

object(PDOStatement)#2 (1) {
  ["queryString"]=>
  string(7) "INSERT;"
}
bool(false)

I'll gladly take a performance hit for code that actually works.

FWIW

PHP Version: PHP 5.4.9-4ubuntu2.4 (cli)

MySQL Version: 5.5.34-0ubuntu0

quickshiftin
  • 66,362
  • 10
  • 68
  • 89
  • 1
    It's an interesting point. I guess emulation postpones server-side parsing to the execution phase. While it isn't a big deal (wrong SQL will eventually fail) it's cleaner to let `prepare` do the job it's supposed to. (Additionally, I've always assumed that the client-side parameter parser will necessarily have bugs of its own.) – Álvaro González Jul 15 '14 at 10:54
  • 1
    IDK if you're interested, but [here's a little writeup](https://quickshiftin.com/blog/2014/07/batching-requests-database-migrations-php-pdo/) on some other spurious behavior I noticed with PDO that lead me down this rabbit hole to begin with. Seems the handling of multiple queries is lacking. – quickshiftin Jul 15 '14 at 16:44
  • I just looked at some migrations libraries on GitHub... What do you know, [this one](https://github.com/brtriver/dbup/blob/master/src/Dbup/Application.php) pretty much does the exact same thing as my blog post. – quickshiftin Jul 15 '14 at 20:13
6

Why switch emulation to ‘false’?

The main reason for this is that having the database engine do the prepare instead of PDO is that the query and the actual data are sent separately, which increases security. This means when the parameters are passed to the query, attempts to inject SQL into them are blocked, since MySQL prepared statements are limited to a single query. That means that a true prepared statement would fail when passed a second query in a parameter.

The main argument against using the database engine for the prepare vs PDO is the two trips to the server – one for the prepare, and another for the parameters to get passed – but I think the added security is worth it. Also, at least in the case of MySQL, query caching has not been an issue since version 5.1.

https://tech.michaelseiler.net/2016/07/04/dont-emulate-prepared-statements-pdo-mysql/

Community
  • 1
  • 1
Harry Bosh
  • 3,611
  • 2
  • 36
  • 34
  • 3
    [Query caching is gone](https://dev.mysql.com/doc/refman/5.7/en/query-cache.html) anyway: *The query cache is deprecated as of MySQL 5.7.20, and is removed in MySQL 8.0.* – Álvaro González Jun 15 '18 at 10:54
5

For the record

PDO::ATTR_EMULATE_PREPARES=true

It could generate a nasty side effect. It could return int values as string.

PHP 7.4, pdo with mysqlnd.

Running a query with PDO::ATTR_EMULATE_PREPARES=true

Column : id
Type :integer
Value : 1

Running a query with PDO::ATTR_EMULATE_PREPARES=false

Column : id
Type :string
Value : "1"

In any case, decimal values are always returned a string, regardless of the configuration :-(

magallanes
  • 6,583
  • 4
  • 54
  • 55
  • 3
    decimal values are always returned a string is the only correct way – Your Common Sense Jul 27 '20 at 14:56
  • Yes from the viewpoint of MySQL but it's wrong at the PHP side. Both Java and C# consider Decimal as a numeric value. – magallanes Jul 27 '20 at 17:54
  • Nope, it is not. It as all correct for the entire computer science. If you think it is wrong, then you need another type, of arbitrary precision – Your Common Sense Jul 27 '20 at 17:56
  • @YourCommonSense Care to dig a little deeper into why you think that? – Robert McKee Aug 27 '20 at 03:25
  • @RobertMcKee you need to think why there is a decimal type when there is already a float type. – Your Common Sense Aug 27 '20 at 03:40
  • 2
    @YourCommonSense a float can not accurately represent 0.2, while a decimal can. But, that's a PHP problem, not a entire computer science problem. Many languages (and databases) have intrinsic data types that can represent numbers such as 0.2 accurately. PHP can, but there is no intrinsic data type in the base language. But to say decimal values are always returned as a string is the only correct way is meh. That is assuming you'd prefer 0.2 to be accurately represented over "12" being sorted before "2". Also "2" not being equal to "2.0". – Robert McKee Aug 27 '20 at 05:11
  • 1
    All right, it's "a PHP problem". In PHP, decimal values are always returned a string is the only correct way – Your Common Sense Aug 27 '20 at 06:03
  • The edit queue for this post is full, so I'll add it here: You switched the values of `PDO::ATTR_EMULATE_PREPARES` - `true` will return everything as strings, `false` will allow returning ints and floats. – Adrian Wiik May 12 '22 at 08:04