sql injection - how to sanitize program generated sql clause?

Question

in standard Ajax, where and order by SQL clauses are provided by the program (not user), eg

var url = ".select?dd=emp&where="+escape("emp_tp='abc' and hire_dt<current_date-'2 years' and super_emp_id is distinct from emp_id")

answered on the server by

$where = (isset($_GET['where'])) ? pureClause($_GET['where']) : null;
$order = (isset($_GET['order'])) ? pureClause($_GET['order']) : null;
...
$query = $query.(($where)?" where $where":'').(($order)?" order by $order":'');

the question is what should function pureClause look like?

right now pureClause simply raises error if any of the following exist:

; select insert update delete drop create truncate

if other injection causes query failure, that's fine, as long as data undamaged.

to me this seems adequate, but in my heart, I know I'm wrong.

Clarifications:

prepared statements in Postgres, although very fast, are a pain to set up and maintain - they're ok for well used queries but not custom queries.
creating a prepared statement for each transaction is a huge db hit. much preferred if security can be attained in at the app level.

Lastly, consider the where clause

emp_tp='abc' and hire_dt=current_dt-'2 years' and super_emp_id is distinct from emp_id

how many placeholders here? this needs to be parsed correctly before being fed into a prepared statement with placeholders, right? or am I completely missing the boat?

Primary facts:

not practical to write a SQL clause parser for parameterized prepared statements
not practical to write a SQL clause sanitizer that guarantees no harm

Solution:

for SELECTS, where the random SQL can be a problem: since it's too hard to protect the database, let the database protect itself! have different users have different roles / permissions. use a read-only user for selects. for normal SQL, this guarantees no DML from these statements.

best practices: four db user accesses

developer, do everything (never use as connection in web app)
dml - can select / dml on almost everything (must use for dml)
read - can select (use for all selects, whether prepared or text)
login - can only execute login/password functions (used in login process)

password protection:

dml and read may not access password data, either through select or dml
login should access password data only through protected functions, eg,

     function login( username, password ) - returns user_id
     function set_password( usr_id, password ) - sets password

only login may run the login() and set_password() functions
depending on your database, login may need sql access to password columns
depending on your database, the password column may be protected itself; if not, then should be moved out of the user table into its own secure table

setting this up in mysql, using the administrator tool, took about 30 minutes, including time to write the login functions and split out the password column.

Use placeholders. A dynamically generated query does not change this. I find this approach to be *less* ideal than the (should be) outdated approach of `foo_real_escape_string`. What if I *wanted* to look for "drop"? — , Sep 03 '11 at 07:19
@pst - 1) don't placeholders imply prepared statements? 2) if the where clause is something like "emp_tp='abc' and hire_dt>current_date-'2 years' and super_emp_id!=emp_id" would require parsing the clause and creating a placeholder for each section and pulling out constants versus column names etc - am I missing something? — cc young, Sep 03 '11 at 07:27
Ah. A beautiful example of an intelligent question where the "use prepared statements" meme is not the answer. Sadly, it's probably still going to end up the highest voted contribution. Closely related: [Escaping field names in PDO statements](http://stackoverflow.com/q/1542627) — Pekka, Sep 03 '11 at 07:58
@Pekka I didn't suggest using field names as prepared statement parameters. The [question](http://stackoverflow.com/q/1542627) you linked to has a very peculiar requirement wherein the field names are part of the user input. Here I would just map the incoming field name params to actual table field names and place the incoming input into prepared statement parameters. — Sahil Muthoo, Sep 03 '11 at 10:14
prepared statements are too much work, so column names are a side issue. the core issue is, what is the least we can do to ensure a valid statement, with unsure where and clauses, executes no dml? thereby my idea of raise exception if clause has a dml verb therein. — cc young, Sep 03 '11 at 10:34

rook · Answer 1 · 2011-09-03T23:56:03.457

5

What you are doing is the very definition of sql injection and cannot be sanitized. You cannot pass in a WHERE clause in a safe fashion period end of story. You must build this part of the query on the server side. The fact that you didn't recognize this means you MUST read more about sql injection, clearly asking StackOverflow is an insecure approach to this problem. The fear is that you may never learn the fundamentals of this vulnerability.

$order can be done in a secure way with a white list. For example:

if(in_array($_GET['order'],$list_of_rows)){
   $order=$_GET['order'];
}

If you are passing in a table name or column name make sure you check it against a white list, or this will be sql injection.

edited Sep 03 '11 at 23:56

answered Sep 03 '11 at 23:50

rook

66,304
38
162
239

first, have read and will read more on sql injection. second, sql injection here only hurts if it results in dml. in other words, the where and order clauses _can_ be contaminated as long as they don't hurt the data - if it errors, for example, that's fine. with these very limited goals, do you still think it's impossible to write an adequate filter? – cc young Sep 04 '11 at 02:09
@cc young The danger will look something like this: `$order="(select if(substr(password,0,1),0,sleep(30)) from mysql.user where Name='root')"`. Sub-selects and union selects can be used to obtain data from another table, or even another database. A whitelist is required, a blacklist of nasty characters will not work. – rook Sep 04 '11 at 16:20
@cc young Also in this example i am using blind sql injection, in that if the attacker correctly guesses one character in of the password hash then it will take 30 seconds for the HTTP request to complete. This is called an "out of band" attack. – rook Sep 04 '11 at 16:25
great examples! in my submitted answer, ie, a read-only or safe connection, it would also be necessary to exclude access to `users` table and/or particular columns. used to do this all the time in client / server environment where each person had his own connection to the db and was assigned different roles. instead of dropping the idea completely and going with one connection, am now convinced a hybrid approach is correct - different connections with different security flavors, even if always using prepared statements. – cc young Sep 05 '11 at 03:58
@cc young I am apprehensive, i don't think this solves the root of the problem. I still think paramterized queries is the best solution. Keep in mind this is the first sql injection exploit you have seen, your adversaries have been studding them for years. exploit-db.com – rook Sep 05 '11 at 05:53

cc young · Accepted Answer · 2011-09-04T14:13:07.937

2

Got it! Routing all of these queries through a database user (connection) who has only been granted SELECT privileges on the database!

Attempted DML will choke. This does not prevent DoS attacks (lots of ways to do that!), but does protect the data. Nor does the make for secure queries, like login. But for client generated WHERE and ORDER, with the goal of preventing DML, this should work just fine.

Ten/fifteen years ago always set up different users for different roles, but with app layer etc etc got out of the habit. It's probably a good idea to re-invest in those principles.

Unless hear differently will mark this as correct answer - it satisfies all criteria, howbeit it dodges the theoretically impossible challenge fo writing a sanitizer.

edited Sep 04 '11 at 14:13

answered Sep 04 '11 at 02:19

cc young

18,939
31
90
148

I wrote the placeholder comment before seeing the advance-case requirement. Unfortunately the advanced requirements makes it all but impossible to do safely without a *proper* SQL expression parser and syntax rebuild for the database. +1 for this solution, if such control is really needed for the user; I had entirely forgotten about being able to impose separation of concerns. It might take a little bit of discipline to logically isolate the connections and prevent accidental "cross-talk" in code. – Sep 04 '11 at 17:19
@pst - thanks. really think having a read-only connection around all the time is really good idea. dml statement are typically easy to parameterize - use them on the dml connection, selects on the read-only. not addressed by anyone is the _db cost_ of prepared statements. since selects are frequently ad hoc, the read-only connection saves both the work of parsing and the extra cost to the db – cc young Sep 05 '11 at 03:42

Sahil Muthoo · Answer 3 · 2011-09-03T06:57:40.867

1

Always use prepared statements. It will handle input escaping and avoid sql injection. There will be no need for hacks like pureClause. Check out mysqli_stmt::prepare()

edited Sep 03 '11 at 06:57

answered Sep 03 '11 at 06:52

Sahil Muthoo

12,033
2
29
38

To comment further, if he's trying to build up a query via ajax to send over, chunk the pieces into a json (i.e. {"where": {"logic": "and", comparisons:[{"key": "woot", "val": "yelp"}, {"key": "woot2", "val": "yelp2", "comparison": "!="}]}}. Then he can run through his own custom query builder and build up a real nice prepared statement. – Stephen Sep 03 '11 at 07:04
agree, and in postgres I always do for common tasks - gives speed and protection. but prepared statements are expensive to maintain - creating a _custom_ prepared statement for each sql interaction simply for security seems to me a hugh unnecessary hit if security can be otherwise achieved by the app layer. having to use mysql, saw no prepared statements in the php mysql library. will checkout mysqli_stmt, but, that said, no assurances of prepared statements with other interfaces, so would still like a decent pureClause function – cc young Sep 03 '11 at 07:08
PDO also supports placeholders. +1 for *not* recommending `mysql_real_escape_string` or ilk. (Placeholders can even be used *with* dynamically generated SQL statements so there really is no excuse.) – Sep 03 '11 at 07:16
2

-1: Both `mysql_real_escape_string` and prepared statements are [utterly useless](http://stackoverflow.com/q/1542627) when it comes to sanitizing column names. This is just reiterating a meme that is often the right recommendation, except when it's not. – Pekka Sep 03 '11 at 07:59
@Pekka I didn't suggest using field names as prepared statement parameters. The [question](http://stackoverflow.com/q/1542627) you linked to has a very peculiar requirement wherein the field names are part of the user input. Here I would just map the incoming field name params to actual table field names and place the incoming input into prepared statement parameters. – Sahil Muthoo Sep 03 '11 at 10:11
-1 this doesn't help when you are building a query like this. – rook Sep 03 '11 at 23:44

score 1 · Answer 4 · edited Sep 03 '11 at 23:56

1

As @Stephen suggested, provide WHERE as an object, then parse the object and generate safe SQL var where = { emp_tp: { condition: equal value: 'abc' } var order = { emp_tp: 'ASC' }

send it as json:

var params = {
  w: where,
  o: order
}
$.post(url,params,function(result){...}, 'json');

And in PHP

$where = isset($_POST['w']) ? json_decode($_POST['w') : array();
if (!empty($where)) {
  foreach ($where as $field => $data) {
     // validate that field exists
     // validate that operator is valid
     $sql .= sprintf('%s %s "%s"', $field, $data->operator, mysql_escape_string($data->value));
  }
}

edited Sep 03 '11 at 23:56

rook

66,304
38
162
239

answered Sep 03 '11 at 07:26

Maxim Krizhanovsky

26,265
5
59
89

consider "emp_tp='abc' and hire_dt=current_dt-'2 years' and super_emp_id!=emp_id" - the parsing can be nontrivial. what I _do_ have going for me is that it's perfectly ok if the query fails (so no need to check valid field etc) as long as no harm done – cc young Sep 03 '11 at 07:35
@cc I think you *will* need to check valid field names, because as far as I know, there is no method to escape them. +1 - this is at least going in the right direction – Pekka Sep 03 '11 at 08:01
@Pekka - in this case, as stated above, failure is ok. the only thing _not_ ok is data being mangled. because of this, see no need to check column names. but could be wrong. – cc young Sep 03 '11 at 08:03
@cc young so sanitation (preventing SQL injection) is not an issue? – Pekka Sep 03 '11 at 08:04
@Pekka - sanitation inasmuch as no data begin mangled - that's why cheking for ;, update, etc. normal failure, if caught, hurts nothing. – cc young Sep 03 '11 at 08:13
1

@Darhazer this is potentially hazards a field could legitimately exist, such as `password` in the user table, but by building a malicious where clause you could leak the password hash belonging to another user. – rook Sep 03 '11 at 23:58

sql injection - how to sanitize program generated sql clause?

4 Answers4

Linked