113

We are having another discussion here at work about using parametrized sql queries in our code. We have two sides in the discussion: Me and some others that say we should always use parameters to safeguard against sql injections and the other guys that don't think it is necessary. Instead they want to replace single apostrophes with two apostrophes in all strings to avoid sql injections. Our databases are all running Sql Server 2005 or 2008 and our code base is running on .NET framework 2.0.

Let me give you a simple example in C#:

I want us to use this:

string sql = "SELECT * FROM Users WHERE Name=@name";
SqlCommand getUser = new SqlCommand(sql, connection);
getUser.Parameters.AddWithValue("@name", userName);
//... blabla - do something here, this is safe

While the other guys want to do this:

string sql = "SELECT * FROM Users WHERE Name=" + SafeDBString(name);
SqlCommand getUser = new SqlCommand(sql, connection);
//... blabla - are we safe now?

Where the SafeDBString function is defined as follows:

string SafeDBString(string inputValue) 
{
    return "'" + inputValue.Replace("'", "''") + "'";
}

Now, as long as we use SafeDBString on all string values in our queries we should be safe. Right?

There are two reasons to use the SafeDBString function. First, it is the way it has been done since the stone ages, and second, it is easier to debug the sql statements since you see the excact query that is run on the database.

So then. My question is whether it really is enough to use the SafeDBString function to avoid sql injection attacks. I have been trying to find examples of code that breaks this safety measure, but I can't find any examples of it.

Is there anybody out there that can break this? How would you do it?

EDIT: To summarize the replies so far:

  • Nobody has found a way to get around the SafeDBString on Sql Server 2005 or 2008 yet. That is good, I think?
  • Several replies pointed out that you get a performance gain when using parametrized queries. The reason is that the query plans can be reused.
  • We also agree that using parametrized queries give more readable code that is easier to maintain
  • Further it is easier to always use parameters than to use various versions of SafeDBString, string to number conversions and string to date conversions.
  • Using parameters you get automatic type conversion, something that is especially useful when we are working with dates or decimal numbers.
  • And finally: Don't try to do security yourself as JulianR wrote. The database vendors spend lots of time and money on security. There is no way we can do better and no reason we should try to do their job.

So while nobody was able to break the simple security of the SafeDBString function I got lots of other good arguments. Thanks!

Community
  • 1
  • 1
Rune Grimstad
  • 35,612
  • 10
  • 61
  • 76
  • 16
    Your colleagues are way, way, off base. Challenge them to find a single piece of literature in support of their position. The argument ex neolithos is ridiculous, things change, only a person stuck in the stone age would fail to adapt. – annakata May 26 '09 at 13:03
  • 1
    Well, at least your colleagues protect against ONE of the different forms of hack... Are they sure thats all parameterized queries do? (I'm not...) – Arjan Einbu May 26 '09 at 13:10
  • For the error reporting, you could subclass SqlCommand and override its Execute* methods with "friendlier" exception handling that reported parameter name/value pairs, etc. – Chris Farmer May 26 '09 at 13:50
  • 1
    Any one vulnerability will not convince them. If you bring several vulnerabilities (which is what you are asking for) and other issues and point out one by one that parameters will solve that issue and that your team would have to write mountains of code to provide a fraction of the functionality, you may win them over. Good luck. – Robert Gowland May 26 '09 at 14:36
  • @Rune: No one here has really _tried_ to break SafeDBString, since we all know parameterized queries are the correct answer to the question. OTOH, I bet most of us here aren't hackers - they probably _have_ tried to break it. – John Saunders May 27 '09 at 15:16
  • 3
    Even without single quotes, you can still break your code with logic. Try using the username "test OR 1=1" - you get all rows returned rather than just the one with the username test! – Bridge Feb 27 '12 at 22:32
  • You can see a great example [here](http://stackoverflow.com/a/12235773/942659). In that example, your example success; but the answer that they provided fails :) – Mahdi Ghiasi Sep 10 '12 at 11:38
  • 1
    Sigh. I really don't get how we as an industry manage to keep tolerating this kind of unprofessional behavior. – jeroenh Sep 28 '16 at 13:21
  • Discussions of this type are a sign that some people are not doing real work. See if you can't shove a few bugs their way. – Andomar May 26 '09 at 13:11
  • It somewhat amuses me to see people go to great lengths when they can just use parametrized queries. I mean, they solve so many problems at once, it should be a no-brainer! I've yet to hear a sensible reason why NOT to use them. I mean, you don't output html using echo, now, do you? :P – shylent May 26 '09 at 13:42
  • I can think of two other cases where you would use this method over parameterisation. 1: Dynamic SQL for bulk insert operations. It is not possible to use parameters with Bulk Insert. 2: Where you have a variable number of distinct parameter values. Lets say I am building a create table statement where the field names and values are sourced from a table. Each field name and data type is a parameter and there can be any number of them. I can dynamically build a sp_executesql call, which itself would be vulnerable. Or I can just build a string and parse the inputs through SafeDBString – Jimbo Oct 17 '18 at 10:00

21 Answers21

82

I think the correct answer is:

Don't try to do security yourself. Use whatever trusted, industry standard library there is available for what you're trying to do, rather than trying to do it yourself. Whatever assumptions you make about security, might be incorrect. As secure as your own approach may look (and it looks shaky at best), there's a risk you're overlooking something and do you really want to take that chance when it comes to security?

Use parameters.

JulianR
  • 16,213
  • 5
  • 55
  • 85
  • Re "Use whatever trusted, industry standard library there is" - can you recommend one for .NET? Maybe more than one depending on the DB: SQLServer, MySQL, PostgreSQL? I've looked for SQL-sanitizer but without much luck, so hsve been forced to implement my own, as best I can (which is no doubt far from foolproof). – PSU Nov 29 '19 at 11:04
72

And then somebody goes and uses " instead of '. Parameters are, IMO, the only safe way to go.

It also avoids a lot of i18n issues with dates/numbers; what date is 01/02/03? How much is 123,456? Do your servers (app-server and db-server) agree with each-other?

If the risk factor isn't convincing to them, how about performance? The RDBMS can re-use the query plan if you use parameters, helping performance. It can't do this with just the string.

Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
  • I have tried the formatting and performance arguments, but they still aren't convinced. – Rune Grimstad May 26 '09 at 12:53
  • 5
    Actually, sql server can re-use the query plan whether you use parameters or not. I agree with the other arguments, but for most cases the performance argument for parameterized sql doesn't fly anymore. – tnyfst May 26 '09 at 15:07
  • 1
    @tnyfst: it can reuse the execution plan when the query string changes for every combination of parameter values? I did not think that possible. – John Saunders May 26 '09 at 16:29
  • 4
    The query plan will be reused if the query text is IDENTICAL to an earlier query text. So if you send the EXACT SAME query twice, it will be reused. However, if you change even just a space or a comma or something, a new query plan will have to be determined. – marc_s May 26 '09 at 18:44
  • @tnyfst - the point is that the query plan to fetch record "12345" can be re-used to fetch record "67890" if it uses a paramerized @id – Marc Gravell May 26 '09 at 19:40
  • 1
    @Marc: I'm not sure you are entirely correct. SQL Servers caching hueristics are a little weird. The parser is capable of identifying constants in the text and can convert the SQL string to one the uses parameters artificially. It can then insert into the cache the text of this new parameterised query. Subsequent similar SQL may find its parameterised version matched in the cache. However, parameterised versions aren't always used with the originaly SQL versions being cached instead, I suspect SQL has a zillion performance related reasons to choose between the two approaches. – AnthonyWJones May 28 '09 at 12:35
  • Indeed it is complex; but we can make the job simpler, and reduce the number of candidates... – Marc Gravell May 28 '09 at 13:05
  • By default SQL Server uses Simple Parametrization - ie it will auto parametrize simple queries; but not complex queries. To force SQL Server to parametrize complex queries also, you can set database parameter "PARAMETRIZATION" to "FORCED". Eg: ALTER DATABASE AdventureWorks; SET PARAMETERIZATION FORCED; – Dharmendar Kumar 'DK' Apr 05 '16 at 18:25
  • @DK or you can just employ competent employees and use parameters correctly ;) – Marc Gravell Apr 05 '16 at 19:38
27

The argument is a no-win. If you do manage to find a vulnerability, your co-workers will just change the SafeDBString function to account for it and then ask you to prove that it's unsafe all over again.

Given that parametrized queries are an undisputed programming best practice, the burden of proof should be on them to state why they aren't using a method that is both safer and better performing.

If the issue is rewriting all the legacy code, the easy compromise would be to use parametrized queries in all new code, and refactor old code to use them when working on that code.

My guess is the actual issue is pride and stubbornness, and there's not much more you can do about that.

Matthew Christensen
  • 1,741
  • 12
  • 15
19

First of all, your sample for the "Replace" version is wrong. You need to put apostrophes around the text:

string sql = "SELECT * FROM Users WHERE Name='" + SafeDBString(name) & "'";
SqlCommand getUser = new SqlCommand(sql, connection);

So that's one other thing parameters do for you: you don't need to worry about whether or not a value needs to be enclosed in quotes. Of course, you could build that into the function, but then you need to add a lot of complexity to the function: how to know the difference between 'NULL' as null and 'NULL' as just a string, or between a number and a string that just happens to contain a lot of digits. It's just another source for bugs.

Another thing is performance: parameterized query plans are often cached better than concatenated plans, thus perhaps saving the server a step when running the query.

Additionally, escaping single quotes isn't good enough. Many DB products allow alternate methods for escaping characters that an attacker could take advantage of. In MySQL, for example, you can also escape a single quote with a backslash. And so the following "name" value would blow up MySQL with just the SafeDBString() function, because when you double the single quote the first one is still escaped by the backslash, leaving the 2nd one "active":

x\' OR 1=1;--


Also, JulianR brings up a good point below: NEVER try to do security work yourself. It's so easy to get security programming wrong in subtle ways that appear to work, even with thorough testing. Then time passes and a year later your find out your system was cracked six months ago and you never even knew it until just then.

Always rely as much as possible on the security libraries provided for your platform. They will be written by people who do security code for a living, much better tested than what you can manage, and serviced by the vendor if a vulnerability is found.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
  • 5
    The replace function adds the apostrophes – Rune Grimstad May 26 '09 at 12:54
  • 5
    Then it's just one more source of bugs. How does it know the difference between NULL as a null value and NULL as a text string? Or between a number input and a string that just happens to contain digits? – Joel Coehoorn May 26 '09 at 13:00
  • Good point. You should only use the function for strings, and possibly dates, so you have to be careful. That is one more reason to use parameters! Yay! – Rune Grimstad May 26 '09 at 13:05
10

So I'd say:

1) Why are you trying to re-implement something that's built in? it's there, readily available, easy to use and already debugged on a global scale. If future bugs are found in it, they'll be fixed and available to everyone very quickly without you having to do anything.

2) What processes are in place to guarantee that you never miss a call to SafeDBString? Missing it in just 1 place could open up a whole host of issues. How much are you going to eyeball these things, and consider how much wasted that effort is when the accepted correct answer is so easy to reach.

3) How certain are you that you've covered off every attack vector that Microsoft(the author of the DB and the access library) knows about in your SafeDBString implementation ...

4) How easy is it to read the structure of the sql? The example uses + concatenation, parameters are very like string.Format, which is more readable.

Also, there are 2 ways of working out what was actually run - roll your own LogCommand function, a simple function with no security concerns, or even look at an sql trace to work out what the database thinks is really going on.

Our LogCommand function is simply:

    string LogCommand(SqlCommand cmd)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendLine(cmd.CommandText);
        foreach (SqlParameter param in cmd.Parameters)
        {
            sb.Append(param.ToString());
            sb.Append(" = \"");
            sb.Append(param.Value.ToString());
            sb.AppendLine("\"");
        }
        return sb.ToString();
    }

Right or wrong, it gives us the information we need without security issues.

Jim T
  • 12,336
  • 5
  • 29
  • 43
  • 1
    He's probably got to deal with a bunch of old VBSCRIPT programmers who are used to doing everything, including XML and SQL, through string concatenation. These will be people who are scared by the use of an API. There's nothing much that can be done with them, at least nothing humane. – John Saunders May 26 '09 at 16:34
  • 1
    +1 for item #2, with the exception that there's no way to enforce real parameters either. – Joel Coehoorn May 26 '09 at 17:11
7

With parameterised queries you get more than protection against sql injection. You also get better execution plan caching potential. If you use the sql server query profiler you can still see the 'exact sql that is run on the database' so you're not really losing anything in terms of debugging your sql statements either.

Steve Willcock
  • 26,111
  • 4
  • 43
  • 40
5

I have used both approaches to avoid SQL injection attacks and definitely prefer parametrized queries. When I have used concatenated queries I have used a library function to escape the variables (like mysql_real_escape_string) and wouldn't be confident I have covered everything in a proprietary implementation (as it seems you are too).

RedBlueThing
  • 42,006
  • 17
  • 96
  • 122
  • 2
    +1 because mysql_real_escape_string() escapes \x00, \x1a, \n \r ' and ". It also handles character set issues. The OP's coworkers naive function doesn't do any of that! – Bill Karwin May 26 '09 at 18:44
4

You aren't able to easily do any type checking of the user input without using parameters.

If you use the SQLCommand and SQLParameter classes to make you're DB calls, you can still see the SQL query that's being executed. Look at the SQLCommand's CommandText property.

I'm always a litle suspect of the roll-your-own approach to preventing SQL injection when parameterized queries are so easy to use. Second, just because "it's always been done that way" doesn't mean it's the right way to do it.

Tim Scarborough
  • 1,270
  • 1
  • 11
  • 22
3

This is only safe if you're guaranteed that you're going to pass in a string.

What if you're not passing in a string at some point? What if you pass just a number?

http://www.mywebsite.com/profile/?id=7;DROP DATABASE DB

Would ultimately become:

SELECT * FROM DB WHERE Id = 7;DROP DATABASE DB
joshcomley
  • 28,099
  • 24
  • 107
  • 147
  • It's either a string or a number. A string is escaped with SafeDbString. A number is an Int32, and it can't drop databases. – Andomar May 26 '09 at 13:21
  • Numbers are easier to handle. You just convert the parameter to an int/float/whatever before using it in the query. The problem is when you must accept string data. – Rune Grimstad May 26 '09 at 13:21
  • Andomar - if you're just constructing a SQL statement by hand then it's intended "type" doesn't matter, you can SQL inject with a number very, very easily. Rune - I think this is relying far too much on the individual developer to remember all the nuances of manually solving SQL injection. If you just say "use parameters" it's very simple and they can't go wrong. – joshcomley May 26 '09 at 13:24
  • @Andomar: What about NULL? Or strings that look like numbers? – Joel Coehoorn May 26 '09 at 13:38
2

For the reasons already given, parameters are a very good idea. But we hate using them because creating the param and assigning its name to a variable for later use in a query is a triple indirection head wreck.

The following class wraps the stringbuilder that you will commonly use for building SQL requests. It lets you write paramaterized queries without ever having to create a parameter, so you can concentrate on the SQL. Your code will look like this...

var bldr = new SqlBuilder( myCommand );
bldr.Append("SELECT * FROM CUSTOMERS WHERE ID = ").Value(myId, SqlDbType.Int);
//or
bldr.Append("SELECT * FROM CUSTOMERS WHERE NAME LIKE ").FuzzyValue(myName, SqlDbType.NVarChar);
myCommand.CommandText = bldr.ToString();

Code readability, I hope you agree, is greatly improved, and the output is a proper parameterized query.

The class looks like this...

using System;
using System.Collections.Generic;
using System.Text;
using System.Data;
using System.Data.SqlClient;

namespace myNamespace
{
    /// <summary>
    /// Pour le confort et le bonheur, cette classe remplace StringBuilder pour la construction
    /// des requêtes SQL, avec l'avantage qu'elle gère la création des paramètres via la méthode
    /// Value().
    /// </summary>
    public class SqlBuilder
    {
        private StringBuilder _rq;
        private SqlCommand _cmd;
        private int _seq;
        public SqlBuilder(SqlCommand cmd)
        {
            _rq = new StringBuilder();
            _cmd = cmd;
            _seq = 0;
        }
        //Les autres surcharges de StringBuilder peuvent être implémenté ici de la même façon, au besoin.
        public SqlBuilder Append(String str)
        {
            _rq.Append(str);
            return this;
        }
        /// <summary>
        /// Ajoute une valeur runtime à la requête, via un paramètre.
        /// </summary>
        /// <param name="value">La valeur à renseigner dans la requête</param>
        /// <param name="type">Le DBType à utiliser pour la création du paramètre. Se référer au type de la colonne cible.</param>
        public SqlBuilder Value(Object value, SqlDbType type)
        {
            //get param name
            string paramName = "@SqlBuilderParam" + _seq++;
            //append condition to query
            _rq.Append(paramName);
            _cmd.Parameters.Add(paramName, type).Value = value;
            return this;
        }
        public SqlBuilder FuzzyValue(Object value, SqlDbType type)
        {
            //get param name
            string paramName = "@SqlBuilderParam" + _seq++;
            //append condition to query
            _rq.Append("'%' + " + paramName + " + '%'");
            _cmd.Parameters.Add(paramName, type).Value = value;
            return this; 
        }

        public override string ToString()
        {
            return _rq.ToString();
        }
    }
}
bbsimonbb
  • 27,056
  • 15
  • 80
  • 110
2

I'd use stored procedures or functions for everything, so the question wouldn't arise.

Where I have to put SQL into code, I use parameters, which is the only thing that makes sense. Remind the dissenters that there are hackers smarter than they are, and with better incentive to break the code that's trying to outsmart them. Using parameters, it's simply not possible, and it's not like it's difficult.

John Saunders
  • 160,644
  • 26
  • 247
  • 397
  • Ok, how to do SQL injection using parameters? – John Saunders May 26 '09 at 13:21
  • @Saunders: Step 1 is to find a buffer overflow bug in the parameter-handling functionality of your DB. – Brian May 26 '09 at 15:04
  • 2
    Found one yet? In a commercial DB that's being pounded on by hundreds of thousands of hackers daily? One made by a software company known to have very deep pockets? You'd be able to quote the lawsuit by _name_ if this were possible. – John Saunders May 26 '09 at 16:27
  • 1
    Of course, if the SPROC uses concatenation and EXEC (instead of sp_ExecuteSQL) you're back in trouble... (I've seen it done wrong too many times to discount it...) – Marc Gravell May 27 '09 at 06:57
2

Agree hugely on the security issues.
Another reason to use parameters is for efficiency.

Databases will always compile your query and cache it, then re-use the cached query (which is obviously faster for subsequent requests). If you use parameters then even if you use different parameters the database will re-use your cached query as it matches based on the SQL string before binding the parameters.

If however you don't bind parameters then the SQL string changes on every request (that has different parameters) and it will never match what's in your cache.

Darren Greaves
  • 3,326
  • 4
  • 29
  • 32
1

Always use parameterized queries where possible. Sometimes even a simple input without the use of any weird characters can already create an SQL-injection if its not identified as a input for a field in the database.

So just let the database do its work of identifying the input itself, not to mention it also saves allot of trouble when you need to actually insert weird characters that otherwise would be escaped or changed. It can even save some valuable runtime in the end for not having to calculate the input.

VulstaR
  • 68
  • 6
1

2 years later, I recidivated... Anyone who finds parameters a pain is welcome to try my VS Extension, QueryFirst. You edit your request in a real .sql file (Validation, Intellisense). To add a parameter, you just type it directly into your SQL, starting with the '@'. When you save the file, QueryFirst will generate wrapper classes to let you run the query and access the results. It will look up the DB type of your parameter and map it to a .net type, which you will find as an input to the generated Execute() methods. Could not be simpler. Doing it the right way is radically quicker and easier than doing it any other way, and creating a sql injection vulnerability becomes impossible, or at least perversely difficult. There are other killer advantages, like being able to delete columns in your DB and immediately see compile errors in your application.

legal disclaimer : I wrote QueryFirst

Community
  • 1
  • 1
bbsimonbb
  • 27,056
  • 15
  • 80
  • 110
1

From the very short time I've had to investigate SQL injection problems, I can see that making a value 'safe' also means that you're shutting the door to situations where you might actually want apostrophes in your data - what about someone's name, eg O'Reilly.

That leaves parameters and stored procedures.

And yes, you should always try to implement code in the best way you know now - not just how its always been done.

quamrana
  • 37,849
  • 12
  • 53
  • 71
  • The double apostrophes will be translated by sql server into a single apostrophe, so O'Reilly would be translated into Name = 'O''Reilly' – Rune Grimstad May 26 '09 at 13:10
  • So is there a corresponding function to remove apostrophes when the user wants to see their data? – quamrana May 26 '09 at 15:39
  • No need. The escape sequence allows the parser to see a single quote rather than the end of the string. As it's parsing, it sees `''` as a literal `'`, so your string will be seen internally as the character sequence `O'Reilly`. That's what the DB will store, retrieve, compare against, etc. If you want to show the user their data after you've escaped it, then keep a copy of the unescaped string appside. – cHao Dec 11 '13 at 21:33
1

Here are a couple of articles that you might find helpful in convincing your co-workers.

http://www.sommarskog.se/dynamic_sql.html

http://unixwiz.net/techtips/sql-injection.html

Personally I prefer to never allow any dynamic code to touch my database, requiring all contact to be through sps (and not one which use dynamic SQl). This means nothing excpt what I have given users permission to do can be done and that internal users (except the very few with production access for admin purposes) cannot directly access my tables and create havoc, steal data or commit fraud. If you run a financial application, this is the safest way to go.

HLGEM
  • 94,695
  • 15
  • 113
  • 186
1

I did not see any other answsers address this side of the 'why doing it yourself is bad', but consider a SQL Truncation attack.

There is also the QUOTENAME T-SQL function that can be helpful if you can't convince them to use params. It catches a lot (all?) of the escaped qoute concerns.

meh-uk
  • 2,031
  • 1
  • 23
  • 36
JasonRShaver
  • 4,344
  • 3
  • 32
  • 39
1

It can be broken, however the means depends on exact versions/patches etc.

One that has already been brought up is the overflow/truncation bug that can be exploited.

Another future means would be finding bugs similar to other databases - for example the MySQL/PHP stack suffered an escaping problem because certain UTF8 sequences could be used to manipulate the replace function - the replace function would be tricked into introducing the injection characters.

At the end of the day, the replacement security mechanism relies on expected but not intended functionality. Since the functionality was not the intended purpose of the code, there is a high probablity that some discovered quirk will break your expected functionality.

If you have a lot of legacy code, the replace method could be used as a stopgap to avoid lengthy rewriting and testing. If you are writing new code, there is no excuse.

David
  • 24,700
  • 8
  • 63
  • 83
0

Here are a few reasons to use parameterized queries:

  1. Security - The database access layer knows how to remove or escape items that are not allowed in data.
  2. Separation of concerns - My code is not responsible for transforming the data into a format that the database likes.
  3. No redundancy - I don't need to include an assembly or class in every project that does this database formatting/escaping; it's built in to the class library.
Powerlord
  • 87,612
  • 17
  • 125
  • 175
0

There were few vulnerability(I can't remember which database it was) that is related to buffer overflow of the SQL statement.

What I want to say is, SQL-Injection is more then just "escape the quote", and you have no idea what will come next.

Dennis C
  • 24,511
  • 12
  • 71
  • 99
0

Another important consideration is keeping track of escaped and unescaped data. There are tons and tons of applications, Web and otherwise, that don't seem to properly keep track of when data is raw-Unicode, &-encoded, formatted HTML, et cetera. It's obvious that it will become difficult to keep track of which strings are ''–encoded and which aren't.

It's also a problem when you end up changing the type of some variable — perhaps it used to be an integer, but now it's a string. Now you have a problem.

Paul Fisher
  • 9,618
  • 5
  • 37
  • 53