2

I took a slight peek behind the curtain at the MySQLdb python driver, and to my horror I saw it was simply escaping the parameters and putting them directly into the query string. I realize that escaping inputs should be fine in most cases, but coming from PHP, I have seen bugs where, given certain database character sets and versions of the MySQL driver, SQL injection was still possible.

This question had some incredibly detailed responses regarding the edge cases of string escaping in PHP, and has led me to the belief that prepared statements should be used whenever possible.

So then my questions are: Are there any known cases where the MySQLdb driver has been successfully exploited due to this? When a query needs to be run in a loop, say in the case of an incremental DB migration script, will this degrade performance? Are my concerns regarding escaped input fundamentally flawed?

Community
  • 1
  • 1
  • Can you give a simple edge case? – Peter Wood Apr 17 '17 at 22:50
  • I'm not sure which mysql module we're talking about here, so I have to ask: Does it _force_ you to use potentially vulnerable features; does it not support prepared statements? – Aran-Fey Apr 17 '17 at 23:07
  • 1
    Could you include a link to the "MySQLdb" part where the "escaping" happens or how? It's most likely not calling `mysql_real_escape_string` :) – MSeifert Apr 17 '17 at 23:12
  • @MSeifert: Looks like it's [this bit](https://github.com/farcepest/MySQLdb1/blob/master/MySQLdb/cursors.py#L164), which interpolates the query parameters with Python `%` formatting instead of using any MySQL prepared statement functionality. – user2357112 Apr 17 '17 at 23:44
  • 1
    @MSeifert: Also, [yes, it is literally calling `mysql_real_escape_string`](https://github.com/farcepest/MySQLdb1/blob/d34fac681487541e4be07e6978e0db233faf8252/_mysql.c#L1127). – user2357112 Apr 17 '17 at 23:54
  • Other bizarre highlights include [using regexes to parse queries in `execute_many`](https://github.com/farcepest/MySQLdb1/blob/9b0b59f1bbe0029757b9b15d1c71586eaca1109c/MySQLdb/cursors.py#L233). – user2357112 Apr 18 '17 at 00:02
  • @user2357112 Are you sure that's the correct repository? It seems like development there has stopped at least three years ago "Latest commit d34fac6 on 2 Jan 2014". If so there are probably *all kinds* of security issues. – MSeifert Apr 18 '17 at 00:11
  • @MSeifert: It's the one linked on the [PyPI page](https://pypi.python.org/pypi/MySQL-python/1.2.5). Development on [what was supposedly going to be its successor](https://github.com/farcepest/moist) stopped even before that. I think we're supposed to use [this thing](https://dev.mysql.com/doc/connector-python/en/) instead. – user2357112 Apr 18 '17 at 00:15
  • Sorry for the late response, I just got home from work. Yes, @user2357112 that is the bit I'm talking about. – Timothy Bess Apr 18 '17 at 03:32

3 Answers3

0

...prepared statements should be used whenever possible.

Yes. That's the best advice. If the prepared statement system is broken there will be klaxons blaring from the rooftops and everyone in the Python world will pounce on the problem to fix it. If there's a mistake in your own code that doesn't use prepared statements you're on your own.

What you're seeing in the driver is probably prepared statement emulation, that is the driver is responsible for inserting data into the placeholders and forwarding the final, composed statement to the server. This is done for various reasons, some historical, some to do with compatibility.

Drivers are generally given a lot of serious scrutiny as they're the foundation of most systems. If there is a security bug in there then there's a lot of people that are going to be impacted by it, the stakes are very high.

The difference between using prepared statements with placeholder values and your own interpolated code is massive even if behind the scenes the same thing happens. This is because the driver, by design, always escapes your data. Your code might not, you may omit the escaping on one value and then you have a catastrophic hole.

Use placeholder values like your life depends on it, because it very well might. You do not want to wake up to a phone call or email one day saying your site got hacked and now your database is floating around on the internet.

tadman
  • 208,517
  • 23
  • 234
  • 262
0

I can't point to any known exploit cases, but I can say that yes, this is terrible.

The Python project calling itself MySQLdb is no longer maintained. It's been piling up unresolved Github issues since 2014, and just quickly looking at the source code, I can find more bugs not yet reported - for example, it's using a regex to parse queries in execute_many, leading it to mishandle any query with the string " values()" in it where values isn't the keyword.

Instead of MySQLdb, you should be using MySQL's Connector/Python. That is still maintained, and it's hopefully less terrible than MySQLdb. (Hopefully. I didn't check that one.)

user2357112
  • 260,549
  • 28
  • 431
  • 505
  • Wow, I had no idea that was a totally different thing. I thought the pip package _was_ the MySQL connector because they had similar method names and functionality. Will definitely be switching to that, thanks. – Timothy Bess Apr 18 '17 at 03:37
  • Looks like MySQL still left that weird regex in there and emulates prepared statements by default, but they offer the MySQLCursorPrepared class that will use prepared statements (and caches them for reuse making it faster for looping queries). I appreciate all the other answers, but this one offered a solution to my issues, so I must choose this one. – Timothy Bess Apr 18 '17 at 04:12
-1

Using prepared statements is faster than concatenating a query. The database can precompile the statement, so only the parameters will be changed when iterating in a loop.