1

I have a SQL query I'm running in an ASP.NET page. The final parsed SQL needs to contain a list of string values in the WHERE [columnname] IN [values] format. For example, the final query might look something like this:

SELECT PRODUCTNAME FROM PRODUCT WHERE PRODUCTCODE IN ('ABC','DEF','GHI','JKL', /* etc */);

However, the string values in the WHERE clause need to be dynamic. Normally I use parametrized queries to make my code convenient and safe, so conceptually I'd like to do something like this:

String[] productCodes = { "ABC", "DEF", "GHI", "JKL" };
SqlCommand cmd = "SELECT PRODUCTNAME FROM PRODUCT WHERE PRODUCTCODE IN (@ProductCodes)";
cmd.Parameters.Add("@ProductCodes", productCodes);

However, this sort of functionality doesn't appear to exist in .NET. How should I go about implementing this? I could use a foreach loop on the array and run a query with a single value as a parameter for each value, but there could potentially be a hundred or so different values in the array and it seems like querying them separately would be very inefficient.

I've read another question where someone suggested a solution for strongly-typed int parameters, but that method would make me nervous about SQL injection when used with String values, especially since the client may very well be able to influence the input values.

How would you implement this query functionality?

EDIT with DB Details:

The database is SQL Server 2005. Sorry for forgetting to mention that.

Community
  • 1
  • 1
Joshua Carmody
  • 13,410
  • 16
  • 64
  • 83

4 Answers4

2

Create your base sql statement as a format, and add the parameters dynamically, and then set the values in a loop.

String[] productCodes = { "ABC", "DEF", "GHI", "JKL" };
string sqlFormat = "SELECT PRODUCTNAME FROM PRODUCT WHERE PRODUCTCODE IN ({0})";
var @params = productCodes.Select((id, index) => String.Format("@id{0}", index)).ToArray();
var sql = String.Format(sqlFormat, string.Join(",", @params));

using(var cmd = new DbCommand(sql))
{
    for (int i = 0; i < productCodes.Length; i++)
        cmd.Parameters.Add(new Parameter(@params[i], DbType.String, productCodes[i]));
    // execute query
}
NerdFury
  • 18,876
  • 5
  • 38
  • 41
  • Cool. Still looking to see if there's a less messy method, but this looks like a good idea. Like arnehehe's answer, but without the vulnerability I didn't notice at first. – Joshua Carmody Mar 11 '11 at 21:22
1

Just an idea:

String[] productCodes = { "ABC", "DEF", "GHI", "JKL" };
SqlCommand cmd = "SELECT PRODUCTNAME FROM PRODUCT WHERE PRODUCTCODE IN ("; 

for (int i=0;i<productCodes.Length;i++) {
  cmd.CommandText += "@" + productCodes[i] + ",";
  cmd.Parameters.Add("@" + productCodes[i], productCodes[i]);
}
cmd.CommandText = cmd.CommandText.SubString(0, cmd.CommandText.Length-1);
cmd.CommandText += ");"

Probably not the best way, but I guess this is how I would try it.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
arnehehe
  • 1,386
  • 1
  • 17
  • 33
  • That's interesting. It's a bit messy, but it looks a lot safer than other things I've considered so far. I might go that route if nobody has a better idea. Thanks. – Joshua Carmody Mar 11 '11 at 21:08
  • 1
    Please don't do this. It's an sql injection attack waiting to happen - maybe not right away, because the product codes are likely themselves sourced from the database, but eventually you'll end up with a 2nd order vulnerability here – Joel Coehoorn Mar 11 '11 at 21:09
  • if the command text parameters are generated without using the values in the names, does that improve the scenario? Isn't this effectively what Linq2Sql and EF do when generating this type of query? Use a bunch of @p1, @p2, @p3, etc parameters, and then assign those values? – NerdFury Mar 11 '11 at 21:14
  • @Joel Coehoorn - Well, it looks like he's using the value as the name of the parameter to construct the command and THAT'S not a good idea. But it you replace `cmd.CommandText += "@" + productCodes[i]` with `cmd.CommandText += "@p" + i.toString()` I would think that would be secure? – Joshua Carmody Mar 11 '11 at 21:20
  • Yeah sorry, I definitely wouldn't use the value as parameter name. Doesn't make sense at all. I guess I just sort of posted it as a potential idea – arnehehe Mar 11 '11 at 21:24
0

You'rel looking for table-valued parameters. However, these weren't available for sql server until well after asp.net was widely adopted, and so the support for them in asp.net is limited.

What I recommend instead is to think of it like building a shopping cart. Users add items to the cart, and at some point you want to display all the items in the cart. In this case, the natural solution is that the shopping cart itself is in a database table rather. Rather than pulling down the items for the card to include as part of an "IN (?)" directive, you build this as a subquery for your directive instead: "WHERE X IN (SELECT X FROM ShoppingCart WHERE UserID= @UserID AND SessionKey= @SessionKey)". Even if you have hundreds of items added, the user is only so going to be so fast and the load per-insert is spread fairly evenly.

Of course, you're probably building something other than a shopping cart, but nevertheless your query will almost always fall into one of three categories:

  1. The items for your list are selected by the user by hand, in which case you can still have each selection result in a new database record that can in turn be used in a subquery
  2. It's data that's already available in your database, in which case you should still be able to use a sub query (and if you can't, it might be time to add a "category" column somewhere so that you can).
  3. It's data that's hard coded into your app, in which case you can also code it into your query

The rare exception is when the query is triggered by another machine source, or you may also have a lot of code that you are reluctant to re-work to make this possible.
So if, for whatever reason, this approach doesn't cut it for you, the standard article on the subject for sql server (including a few alternatives) can be found here:
http://www.sommarskog.se/arrays-in-sql.html

This article really is the standard work on the subject, and is well worth your time.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
0

For the record, querying the hundred or so values in the array "not separately" could also be inefficient (though not as inefficient as a hundred or so round trips to the SQL Server). However, parameterizing the query helps a bit because you can prepare the same query and execute it several times. A stored procedure could be even better.

Any chance you can use Linq, or are you pre-.NET-3.5?

If you can't use Linq, and you absolutely must go this route, try the following:

SqlCommand cmd = "SELECT PRODUCTNAME FROM PRODUCT WHERE PRODUCTCODE = @ProductCode";
cmd.Prepare();
List<string> results;
foreach (string code in productCodes)
{
    cmd.Parameters.Clear();
    cmd.Parameters.Add("@ProductCodes", DbType.VarChar).Value = code;
    cmd.ExecuteQuery();
    // Add code here to add the returned values to the results list.  It's been
    // a while since I've used ADO.NET, and I don't have time to look it up
    // at the moment...
}

And then, you have your list of results.

Andrew
  • 14,325
  • 4
  • 43
  • 64
  • It's .NET 2.0. I tried to install 3.5 or 4.0 on the server (Windows 2003 Server, 64-bit with IIS running in 32-bit) and it caused all kinds of problems (long story). The network administrator and I agreed that I would code in .NET 2.0 until he figures out how to make 4.0 play nice with all of our different sites running on that machine. – Joshua Carmody Mar 11 '11 at 21:10
  • I mentioned the "foreach" possibility in my question. I think executing the query in a loop would be too inefficient when dealing with 100+ values. – Joshua Carmody Mar 11 '11 at 21:13
  • @Joshua Carmody Yes, you did mention that. However, it didn't appear that you were doing any sort of prepare/execute logic. This sort of scenario is one of the reasons for prepare/execute logic in the first place. – Andrew Mar 11 '11 at 21:22
  • Yeah, I understand about statement preparation. But I was under the impression that if I ran a SELECT statement with a "WHERE COLUMN IN (VALUES)" clause on a non-indexed column, SQL Server would only search through the table once, whereas if I performed a prepared statement in a loop searching a non-indexed column, SQL Server would search the entire table every single iteration through the loop. Is that not true? If it is, I think I'd better avoid it. – Joshua Carmody Mar 11 '11 at 21:29
  • @Joshua Carmody Honestly, in SQL Server 2005, I'm not entirely sure. I know that I've had better luck with performance on SQL Server generally when avoiding IN by just about any means possible. That said, if this column isn't indexed, you have bigger problems anyway. With an index, the performance hit shouldn't be too bad. Without the index, this is going to be slow no matter which method you use, especially with a hundred or more values being queried at once. – Andrew Mar 11 '11 at 21:32