10

Is there any function or library which can be used to clean the user input. Like for example if the user input a text called baily's then i should escape the ' before sending it to mysql query. Similarly i should be able to filter null characters and \n, \t, \r etc.. Like in PHP we have mysql_real_escape_string($input) is there anything in Java to do this ?

Bhushan
  • 18,329
  • 31
  • 104
  • 137
Deepak
  • 6,684
  • 18
  • 69
  • 121
  • If you use an ORM layer (which is common practice in Java) or even just Statements with named or enumerated parameters then the escaping of SQL parameters will be done automatically for you. So long as you're not manually concatenating strings together to build your queries, you should be safe. – aroth Jul 11 '11 at 11:56
  • 1
    **Also See :** [`JDBC - how to escape user-supplied parameters with a sql query`](http://stackoverflow.com/questions/4954002/jdbc-how-to-escape-user-supplied-parameters-with-a-sql-query) – jmj Jul 11 '11 at 11:58

5 Answers5

11

In Java, you don't usually do this by hand.

Instead you'll use a PreparedStatement and pass in any arguments to your SQL statement via explicit setString() or setObject() methods.

This way the JDBC driver will handle it (either by doing the necessary escaping or by sending the SQL statement separately form the arguments, depending on the DB).

For example, your code could look like that (using prepareStatement()):

Connection c = ...; // get Connection from somehwere
PreparedStatement stmt = c.prepareStatement("SELECT * FROM BOOKS WHERE TITLE = ?");
stmt.setString(1, userInput);
ResultSet result = stmt.executeQuery();
Joachim Sauer
  • 302,674
  • 57
  • 556
  • 614
4

You use prepared statements with placeholders for this. See http://download.oracle.com/javase/tutorial/jdbc/basics/prepared.html

Just like you should in PHP: http://php.net/manual/en/pdo.prepared-statements.php

I mean no offense when i say that only idiots would use mysql_real_escape_string in favor of prepared statements with placeholders.

Christoffer Hammarström
  • 27,242
  • 4
  • 49
  • 58
2

You should use PreparedStatement and set the values of $input using the setString function.

The reason for using PreparedStatement is a each database may need to escape different things. This complexity is hidden in the concrete implementation of PreparedStatement provided by the database vendor.

Mathias Schwarz
  • 7,099
  • 23
  • 28
2

Short answer, no, except for very specific definitions of "clean". Right now you have to use a language specific solution -- for SQL, just use a prepared statement.

Longer answer, there has been recent work on automatic string sanitizers that figure out how to incorporate plain text content safely & correctly into content in other languages.

Automatic contextual auto-escapers exist for HTML in template languages like Soy, Go, a variant of jQuery, cTemplates, clearsilver and hopefully others soon.

There is research happening on generalizing this so that it can easily be extended to other languages. One idea that I'm working on is taking an annotated grammar that describes a target language like SQL and figure out what escaping needs to be done for holes that can be filled with user data.

Given a grammar like the below which includes annotations that show how the structure of data maps to substrings within the language:

JSONValue            := JSONNullLiteral
                      | JSONBooleanLiteral
                      | JSONObject
                      | JSONArray
                      | JSONString
                      | JSONNumber                                    ;
JSONObject           := @KeyValueMap ([{] JSONMemberList? [}])        ;
JSONMemberList       := JSONMember ([,] JSONMemberList)?              ;
JSONMember           := @Key JSONString [:] @Value JSONValue          ;
JSONNullLiteral      := @ValueNull "null"                             ;
JSONBooleanLiteral   := @ValueFalse "false" | @ValueTrue "true"       ;
JSONArray            := @List("[" (JSONValue ([,] JSONValue)*)? "]")  ;
JSONString           := @String ([\"] JSONStringCharacters? [\"])     ;
JSONNumber           := @Number (Sign? (Mantissa Exponent? | Hex))    ;
JSONStringCharacters := JSONStringCharacter JSONStringCharacters?     ;
JSONStringCharacter  := @Char ([^\"\\\x00-\x1f])
                      | JSONEscapeSequence                            ;
JSONEscapeSequence   := "\\" @Char [/\\\"]
                      | @Char{[\x08]} "\\b"
                      | @Char{[\x0c]} "\\f"
                      | @Char{[\x0a]} "\\n"
                      | @Char{[\x0d]} "\\r"
                      | @Char{[\x09]} "\\t"
                      | @Char ("\\u" @Scalar (hex hex hex hex))       ;
Mantissa             := (Integer ([.] Fraction?) | [.] Fraction)      ;
Exponent             := [Ee] Sign? decimal+                           ;
Integer              := [0] | [1-9] [0=9]*                            ;
Fraction             := [0-9]+                                        ;
Hex                  := [0] [Xx] hex+                                 ;
Sign                 := [+\-]                                         ;

we can build a state machine like the below:

enter image description here

which converts sequences of events (start, start_object, start_key, character 'x', ...) into instructions that encode characters onto a buffer.

From that state machine we can also generate generic traces of instructions to use to generate efficient code for encoders, and hopefully the context analysis algorithms that figure out which encoders to apply when.

If this works, it would make it easy to incorporate into general-purpose programming languages, mechanisms for automatically & securely composing content in languages like SQL, HTML, etc. With tweaks to language definitions to allow execute_query to find the boundaries between programmer-specified-content and injected content in execute_query("SELECT * FROM Table WHERE ID=$ID") and use those to automatically escape the injected content, we can make that idiom just work as the programmer intends.

Mike Samuel
  • 118,113
  • 30
  • 216
  • 245
1

The escaping you use is specific for the system you want to use the string for. If you use MySQL, you have to do different escaping than if you want to use the string in a piece of Javascript.

So to answer your question, we need to know how you want to escape your string. Chances are you don't need to escape the string before you use it in a database context. For example, if you use prepared queries, you don't need to escape your values.

Sjoerd
  • 74,049
  • 16
  • 131
  • 175