5

I was reading last night on preventing SQL injections, and I ran across this answer:

How can I prevent SQL injection in PHP?

The comments from 'Your Common Sense' made it sound like that was dysfunctional/unsafe. However, in my (albeit limited) testing, I found that php's "bin2hex($var)" worked with anything I threw at it - literal number, number string, string of text - even when matching a numerical (tinyint) column.

My question is this: Is there a way to inject SQL when every user input is sanitized via hexing it? In essence, any time a query was made, it would look something like this:

$query="SELECT * FROM table WHERE someidentifier=UNHEX('".bin2hex($unsafe_user_input)."') LIMIT 1"

Basically translating to:

SELECT * FROM table WHERE someidentifier=UNHEX('0b99f') LIMIT 1

Are there any holes in this type of security?

PS - I'm not just looking for answers like "Why not just use PDO or MySQLi with prepared statements?" It may fall under the vast evil of preemptive optimization, but I'd rather not double my query overhead (and yes, I do understand that it can be faster with multiple identical queries, but that's not a situation I often encounter).

Community
  • 1
  • 1
  • 3
    don't roll your own sanitization functions. this is LIKELY safe, but why invent your own "maybe-safe" system when you could use standard "guaranteed safe" versions? – Marc B Mar 21 '14 at 19:38
  • Are there methods of sanitizing in PDO and/or MySQLi (preferably the latter) that don't involve doubling the overhead of a given query? I read something about a filter on w3schools, but I know they're not a terribly trustworthy source... – Joshua Broekhuijsen Mar 21 '14 at 19:43
  • you can roll your own queries by manually escaping, e.g. `real_escape_string`. The overhead of preparing a query is real, but it's very minor in the grand scheme of things. Just don't make the mistake of preparing the query INSIDE an insert loop. Preparing should be done ONE. you then just execute that prepared statement multiple times. – Marc B Mar 21 '14 at 19:46
  • Just remember that the people who say "don't write your own queries, use pdo/mysqli+prepared statements" to prevent sql injection attack never seem to realize that you can still write entirely injectable queries no matter WHAT db library you're using. They're tools. It's not the chainsaw's fault if you saw off you leg. PDO/mysqli can HELP to write safe queries, but they can't MAKE you write safe queries. – Marc B Mar 21 '14 at 19:48
  • Very good things to note, thank you for pointing those out. Are you able to think of a way this could be injected-into? – Joshua Broekhuijsen Mar 21 '14 at 19:51
  • Adds bigger overheads with searching, especially partial searching and can easily return false positives - consider `LIKE '% . bin2hex('BC') . '%'` can return `ABCD` but also `D$0`; and you can forget about full text searching or regexp searching because there's no word boundaries in your content – Mark Baker Mar 21 '14 at 20:15
  • -1 for the preemptive optimization – Your Common Sense Mar 31 '14 at 15:23

4 Answers4

9

Is there a way to inject SQL when every user input is sanitized via hexing it?

If you knew why an SQL injection occurs, you would be able to answer this question yourself.


Let’s see. The CWE describes SQL injections (CWE-89) as follows:

The software constructs all or part of an SQL command using externally-influenced input […], but it does not neutralize or incorrectly neutralizes special elements that could modify the intended SQL command […]

Furthermore:

Without sufficient removal or quoting of SQL syntax in user-controllable inputs, the generated SQL query can cause those inputs to be interpreted as SQL instead of ordinary user data.

So basically: externally-influenced inputs in a generated SQL query are not interpreted as intended. The important part here is: not interpreted as intended.

If a user input is intended to be interpreted as a MySQL string literal but it isn’t, it’s an SQL injection. But why does it happen?

Well, string literals have a certain syntax by which they are identified by the SQL parser:

A string is a sequence of bytes or characters, enclosed within either single quote (“'”) or double quote (“"”) characters.

Additionally:

Within a string, certain sequences have special meaning […]. Each of these sequences begins with a backslash (“\”), known as the escape character. MySQL recognizes the escape sequences shown in Table 9.1, “Special Character Escape Sequences”.

Additionally, to be able to use quotes within string literals:

There are several ways to include quote characters within a string:

  • A “'” inside a string quoted with “'” may be written as “''”.
  • A “"” inside a string quoted with “"” may be written as “""”.
  • Precede the quote character by an escape character (“\”).
  • A “'” inside a string quoted with “"” needs no special treatment and need not be doubled or escaped. In the same way, “"” inside a string quoted with “'” needs no special treatment.

As all these latter mentioned sequences are special to string literals, it is necessary that any data, that is intended to be interpreted as a string literal, is properly processed to conform to these rules. This means in particular: if any of the mentioned characters are intended to be used in a string literal, they have to be written as one of the mentioned ways.

So if you look at it from this way, it is not even a question of security but simply of processing data so that they are being interpreted as intended.

The same applies to the other literals as well as other aspects of SQL.


So what about your question?

My question is this: Is there a way to inject SQL when every user input is sanitized via hexing it? In essence, any time a query was made, it would look something like this:

$query="SELECT * FROM table WHERE someidentifier=UNHEX('".bin2hex($unsafe_user_input)."') LIMIT 1"

Yes, that would be safe from SQL injections. bin2hex returns a string that contains only hexadecimal characters. And neither of these characters require a special treatment when using them in a MySQL string literal.

But seriously, why would anyone want to use these cumbersome formatting technique when there libraries and frameworks that supply convenient techniques like parameterized/prepared statements?

Gumbo
  • 643,351
  • 109
  • 780
  • 844
  • 1
    Great answer, quite detailed. I did note in my question why I was avoiding parameterized statements, though I suspect it would probably not be a huge difference in performance - in which case, this is merely an exercise in programming. – Joshua Broekhuijsen Mar 21 '14 at 22:50
  • @JoshuaBroekhuijsen You have to [differentiate between emulated prepared statements and real prepared statements](http://stackoverflow.com/q/17909212/53114). – Gumbo Mar 21 '14 at 22:53
0

While i'm not familiar with hexing, i have successfully used Base64 to prevent mysql injections in the past from various scripts.

bdb.jack
  • 147
  • 1
  • 8
  • Sounds like much the same method - what's the MySQL function you used to decode the base64? (pseudocode example?) – Joshua Broekhuijsen Mar 21 '14 at 19:40
  • I've never done it directly in MySQL before. I usually use PHP's base64 encode and decode functions, but i found this while looking online: http://stackoverflow.com/questions/358500/base64-encode-in-mysql – bdb.jack Mar 21 '14 at 19:42
  • Hm, that's interesting. Dunno for sure if my production environment will be on MySQL>=5.6, but that's definitely something to consider if is. Could help reduce query size (the hex looks rather bloated). – Joshua Broekhuijsen Mar 21 '14 at 19:46
  • Base64 increases the size of the information (the string) by about 33%, but has the benefits of obfuscating any input and not requiring a specific data encoding (meaning that special characters that would normally require UTF-8 can still be stored without it) – bdb.jack Mar 21 '14 at 19:48
  • That's a really good point, then any character encoding is handled by the php exclusively. The major disadvantage I see to storing things in the database as base64 strings would sorting them alphabetically by say, last name or something (in phpMyAdmin, or what have you) would be ineffective. – Joshua Broekhuijsen Mar 21 '14 at 19:55
  • If you are looking to sort (or even search) via an interface such as phpMyAdmin, you're right. However if you are looking to use PHP to sort AFTER a query's results have been returned, then you can still sort by any key. – bdb.jack Mar 21 '14 at 20:02
  • @JoshuaBroekhuijsen it is **not** the same, it is **not** interesting, and it **not** enough to prevent injections. – Your Common Sense Mar 31 '14 at 15:19
0

Are there any holes in this type of security?

There are no holes but no advantages as well. Conventional string formatting is equally safe but without all this useless hexing/unhexing stuff.

So, bloating your queries with this unhex stuff is just superfluous.

As of yonder answer, initially it was written without unhexing and thus failed with numbers, which made it unusable. While after adding unhex it become just useless and superfluous.

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345
0
$query="SELECT * FROM table WHERE someidentifier=UNHEX('".bin2hex($unsafe_user_input)."') LIMIT 1"

This is a simple and certain answer for protecting against SQL injection using PHP and MySQL.

I've seen comments that suggest that it is still possible to be open to SQL injection with prepared or parameterized SQL. I don't see any reason to fuss with all that when this solution is simple and it works.

I've seen the arguments about CPU time and memory usage, but since this is typically used for user input typed on a screen, who cares about the RAM usage or CPU cycles (gasp!). The user types in 500 characters, the hex version is a thousand characters. Unless your server has a hundred thousand people doing this at the exact same moment, you aren't going to see much difference.

Creating clear, obvious, maintainable code is worth a few CPU cycles and some RAM.

And the technique is so simple, it is easy to remember. It is also unlikely to break or develop holes as PHP continues to evolve.

Doktor J
  • 1,058
  • 1
  • 14
  • 33