19

I have a special case requiring that I generate part of a SQL WHERE clause from user supplied input values. I want to prevent any sort of SQL Injection vulnerability. I've come up with the following code:

private String encodeSafeSqlStrForPostgresSQL(String str) {

  //Replace all apostrophes with double apostrophes
  String safeStr = str.replace("'", "''");

  //Replace all backslashes with double backslashes
  safeStr = safeStr.replace("\\", "\\\\");

  //Replace all non-alphanumeric and punctuation characters (per ASCII only)
  safeStr = safeStr.replaceAll("[^\\p{Alnum}\\p{Punct}]", "");

  //Use PostgreSQL's special escape string modifier
  safeStr = "E'" + safeStr + "'";

  return safeStr;
}

Questions:

  • Do you see any issues?
  • Can you provide a better solution?
  • Are there any existing libraries to help with this?

Notes:

  • This is a common question on SO and elsewhere, but the only answer I've seen is to always use PreparedStatements. Fwiw, I'm using JasperReports. I want to keep the query inside of JasperReports. The built-in Jasper parameter functions for query parameter handling (including the X{} functions) are not sufficient for what I need to parametrize. I could try creating a custom Jasper QueryExecutor that would allow me to inject my own X{} functions, but that's more complicated than just generating a dynamic SQL where clause with Jasper's $P!{} syntax.

  • I looked at the OWASP libraries. They do not have a PostgresSQL codec yet. I looked at the OracleCodec though and its escaping seemed simplistic. I'm not sure it would be of much helping preventing SQL injection attacks.

  • In my code I'm adding the E so as to not be dependent on PostgreSQL's standard_conforming_strings setting. Ideally I wouldn't have to add that and then the function wouldn't have to be PostgreSQL specific. More info: http://www.postgresql.org/docs/9.0/static/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-ESCAPE .

Ideally I would've liked a more generic and robust solution that I knew would be safe and support all possible UTF-8 strings.

kaliatech
  • 17,579
  • 5
  • 72
  • 84
  • 1
    `BaseConnection.escapeString()` seems to cover this http://jdbc.postgresql.org/development/privateapi/org/postgresql/core/BaseConnection.html#escapeString(java.lang.String) – Frank Farmer Mar 16 '12 at 17:05
  • 1
    @FrankFarmer Great idea to look at JDBC driver source. Looking at BaseConnection lead me to the Utils class: http://jdbc.postgresql.org/development/privateapi/org/postgresql/core/Utils.html . Looking at source for that, they switch on the conforming flag, and then escape quotes/apostrophes similar to what I've done. They only have special handling for the \0 character and let everything else pass through. So... I guess that's safe and my removal of every non-standard character beyond \0 is overkill? Feel free to post your comment as an answer that I can accept. – kaliatech Mar 16 '12 at 18:00

2 Answers2

21

The most easiest way would be to use PostgreSQL's Dollar Quoting in the combination with a small random tag:

  • For each invocation calculate a small, random tag (e.g 4 characters) (redundant)
  • Look whether or not the quote tag is part of the input string.
  • If it is, recalculate a new random tag.
  • Otherwise build your query like this:

    $tag$inputString$tag$
    

This way you escape the whole hassle of different nested quoting techniques and you also set up a moving target by using a random tag.

Depending on your security requirements this might do the job or not. :-)

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
A.H.
  • 63,967
  • 15
  • 92
  • 126
  • 1
    Interesting. I wasn't aware of the dollar quoted string syntax. I tried it just now and it did seem to work. Two notes though: 1) You are correct to point out the need to randomize the surrounding tags for security reasons. I think that is the main negative regarding this approach. 2) Via testing, I've determined that the dollar syntax still requires that \0 characters be removed from the string (otherwise an exception is thrown from Postgres...which is still better than allowing a vulnerability through). – kaliatech Mar 16 '12 at 19:15
  • 1
    +1 for dollar quoting. There is a redundancy in your logic, though. If you check the user input for quote-tag anyways, there is no need to randomize initially. You only have to mutate, if the quote tag should pop up, which practically only happens if an attacker has read your code. Only then you have to mutate, which foils the attack vector. I took the liberty to cross out the redundant step. – Erwin Brandstetter Mar 17 '12 at 06:25
  • @ErwinBrandstetter: Not using a random tag right from the start migh open some holes if the attacker might mask his usage of the task somehow. This is a quite a bit handwaving right now, but I could imagine some encoding issues between server encoding and client encoding might do the trick. ISO 2022 (sic?) would be my first guess. But I will think about this while being on a slow line :-) – A.H. Mar 17 '12 at 17:57
  • @kaliatech: Regarding the `\0` removal: I guess this is the case, because Java does not use UTF-8 in the strict sense - it does not encode the Unicode codepoint 0x0000 as the byte `\0` but as a two-byte sequence. Other UTF-8 receviers don't know about this of course. And in C land `\0` can never be be part of a valid string. So this is a security issue not only for SQL/PostgreSQL but for anything leaving the Java domain. – A.H. Mar 17 '12 at 18:04
7

I asked a similar question here, but I think that the best thing to do is to use org.postgresql.core.Utils.escapeLiteral. This is a Postgres library so using it should be safe. If/when Postgres adds new string delimiters this method should be updated.

Community
  • 1
  • 1
sixtyfootersdude
  • 25,859
  • 43
  • 145
  • 213