Reading some of the SQL injection questions and answers on SO, I saw this answer that suggests you can convert untrusted user input to hex, which will by its very nature not require escaping of any kind, and thus completely and utterly avoid the possibility of SQL injection.
What database abstraction layer you use (PDO, mysqli, mysql, Pear DB, etc.) does not matter.
Example of a normal working query:
$DBH = new PDO('mysql:host=127.0.0.1;dbname=test', 'test', 'testpassword');
// could have been:
//$bookTitle = bin2hex($_GET['title']);
$bookTitle = bin2hex('Catch-22');
$query = "SELECT * from `books` WHERE `title` = UNHEX('$bookTitle')";
foreach ($DBH->query($query) as $row) {
echo "\n<br />\n";
print_r($row);
echo "\n<br />\n";
}
I've included enough code here for you to quickly run a test if you have a database with a table such as:
CREATE TABLE `books` (`id` INT, `title` VARCHAR(100), `author` VARCHAR(100)) ENGINE=InnoDB CHARACTER SET='utf8';
INSERT INTO `books` VALUES(1, 'Catch-22', 'Joseph Heller');
Of course this is oversimplified - you'd normally have input validation, output sanitizing and lots of other abstractions but we want to focus on the question - no more fluff than necessary to help provide easily executed examples.
What I'd like to know is if there are any technical weaknesses to this technique. I am specifically not asking if this technique has human weaknesses (easier to futz as a sloppy programmer, since it is clearly not as clean as using parameterized queries).
Yes, we can all agree that parameterized queries are less susceptible to bad programming or unfortunate oversights. So please stick to the question - does this technique help avoid SQL injections of all kinds, unconditionally?
Can someone show an example of user input that would break this technique? Even a corner case, some particular MySQL server settings or old PHP version that breaks it?
It works just as well for integer replacements:
// could have been:
//$bookID = bin2hex($_GET['id']);
$bookID = bin2hex(1);
$query = "SELECT * from `books` WHERE `id` = UNHEX('$bookID')";
Other thoughts:
With this technique you avoid two round trips to the database as happens if using (non-emulated) prepared statements (although prepared statements != parameterized queries).
The caveats to some other suggested techniques start to make one go cross-eyed what with the corner case exceptions such as this and this. Does the hex encoding technique avoid all possibility of attackers wreaking havoc using character set encoding tricks and whatever else is out there?
It appears that decoding on the database side may be limited to MySQL/MariaDB, although there may be third-party solutions for adding UNHEX() to PostgreSQL (or - not sure - in some databases, you might be able to use the other method of placing the hex literals in the query without using any
UNHEX
function)