6

I get a list of IDs and amounts from an Excel file (thousands of IDs and corresponding amounts). I then need to check the database to see whether each ID exists and, if it does, make sure the amount in the DB is greater than or equal to the amount from the Excel file.

The problem is that running this select statement upwards of 6,000 times and returning the values I need takes a long time. Even at half a second apiece it will take about an hour to do all the selects. (I normally don't get more than 5 results back.)

Is there a faster way to do this?

Is it possible to somehow pass all the IDs at once, make just one call, and get the whole collection back?

I have tried using SqlDataReaders and SqlDataAdapters, but they seem to be about the same (too long either way).

The general idea of how this works is below:

for (int i = 0; i < ID.Count; i++)
{
    // one new command and one round trip to the server per ID
    SqlCommand cmd = new SqlCommand("select Amount, Client, Pallet from table where ID = @ID and Amount > 0;", sqlCon);

    cmd.Parameters.Add("@ID", SqlDbType.VarChar).Value = ID[i];

    SqlDataAdapter da = new SqlDataAdapter(cmd);

    da.Fill(dataTable);
    da.Dispose();
}
Craig W.
user2755680

12 Answers

13

Instead of a long `IN` list (difficult to parameterise, and with a number of other inefficiencies around execution plans: compilation time, plan reuse, and the quality of the plans themselves), you can pass all the values in at once via a table-valued parameter.

See arrays and lists in SQL Server for more details.

Generally I make sure to give the table type a primary key and use `OPTION (RECOMPILE)` to get the most appropriate execution plans.
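
A minimal sketch of that approach, assuming a table type named dbo.IdList has already been created (the type, column, and table names here are illustrative):

// Assumes this type exists in the database:
//   CREATE TYPE dbo.IdList AS TABLE (ID varchar(50) PRIMARY KEY);
DataTable idTable = new DataTable();
idTable.Columns.Add("ID", typeof(string));
foreach (string id in ID)
    idTable.Rows.Add(id);

SqlCommand cmd = new SqlCommand(
    @"SELECT t.Amount, t.Client, t.Pallet
      FROM [table] t                    -- the OP's table
      JOIN @IDs i ON i.ID = t.ID
      WHERE t.Amount > 0
      OPTION (RECOMPILE);", sqlCon);

// pass the DataTable as a structured (table-valued) parameter
SqlParameter p = cmd.Parameters.Add("@IDs", SqlDbType.Structured);
p.TypeName = "dbo.IdList";
p.Value = idTable;

using (SqlDataAdapter da = new SqlDataAdapter(cmd))
    da.Fill(dataTable);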

Martin Smith
5

Combine all the IDs together into a single large IN clause, so it reads like:

select Amount, Client, Pallet from table where ID in (1,3,5,7,9,11) and Amount > 0;
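
One sketch of building that safely from C#, with one generated parameter per value (see the comments below about parameter-count limits and batching):

// build "@p0,@p1,..." and bind one parameter per ID, so values are
// never concatenated into the SQL text
var paramNames = new List<string>();
for (int i = 0; i < ID.Count; i++)
    paramNames.Add("@p" + i);

string sql = "select Amount, Client, Pallet from table where ID in ("
             + string.Join(",", paramNames) + ") and Amount > 0;";

SqlCommand cmd = new SqlCommand(sql, sqlCon);
for (int i = 0; i < ID.Count; i++)
    cmd.Parameters.Add(paramNames[i], SqlDbType.VarChar).Value = ID[i];

using (SqlDataAdapter da = new SqlDataAdapter(cmd))
    da.Fill(dataTable);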
Niels Keurentjes
  • That would be my suggestion, but the implementation using SqlCommand is not trivial since Parameters only take single values. You have to be careful to avoid a potential sql injection vulnerability. – AaronLS Sep 03 '14 at 20:43
  • @AaronLS not a fan of this inline SQL but performance is a big issue currently – user2755680 Sep 03 '14 at 20:51
  • 2
@user2755680 You can build the SQL dynamically, but insert values through Parameter objects, which is safer than directly inserting the values into the string, like this: http://stackoverflow.com/questions/400819/using-an-in-operator-with-a-sql-command-object-and-c-sharp-2-0 – AaronLS Sep 03 '14 at 20:53
  • 1
@AaronLS I don't know from the question whether the input is trusted or not. The answer is intended to be conceptual - `IN` is the way to go; how to implement it depends on his specific case. Dynamic parameters are likely best, but for entirely trusted input a joined string will be faster. – Niels Keurentjes Sep 03 '14 at 20:56
  • 2
    For the 6,000 figure mentioned in the question you'd be out of luck with parameters anyway. IIRC the parameter limit is something like 2,000. Very long in lists are specifically warned against in the documentation as potentially leading to Error 8623 or 8632. – Martin Smith Sep 03 '14 at 21:08
2

"I have tried using SqlDataReaders and SqlDataAdapters"

It sounds like you might be open to other APIs. Using Linq2SQL or Linq2Entities:

var someListIds = new List<int> { 1, 5, 6, 7 }; // imagine you load this from wherever
db.MyTable.Where( mt => someListIds.Contains(mt.ID) );

This is safe in terms of avoiding SQL injection vulnerabilities and will generate an "in" clause. Note, however, that someListIds can be so large that the generated SQL query exceeds query-length limits; the same is true of any other technique involving the IN clause. You can easily work around that by partitioning the list into large chunks, and still be tremendously better off than with a query per ID.
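
A rough sketch of that partitioning, reusing the hypothetical db context and someListIds from above (MyTable's entity type is assumed here):

// partition the ID list so each generated IN clause stays a
// manageable size (the chunk size is illustrative)
const int chunkSize = 1000;
var results = new List<MyTable>(); // MyTable = the entity type from above
for (int i = 0; i < someListIds.Count; i += chunkSize)
{
    var chunk = someListIds.Skip(i).Take(chunkSize).ToList();
    results.AddRange(db.MyTable.Where(mt => chunk.Contains(mt.ID)));
}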

AaronLS
2

Use Table-Valued Parameters

With them you can pass a C# DataTable with your values into a stored procedure as a result set/table, which you can join to and do a simple:

SELECT *
FROM YourTable
WHERE NOT EXISTS (SELECT * FROM InputResultSet WHERE YourConditions)
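
A sketch of what that could look like; the type, procedure, and table names here are illustrative:

CREATE TYPE dbo.InputResultSet AS TABLE (ID varchar(50) PRIMARY KEY, Amount money);
GO

CREATE PROCEDURE dbo.CheckAmounts
    @Input dbo.InputResultSet READONLY  -- TVPs must be passed READONLY
AS
-- return the input rows that fail the check (missing ID or smaller amount)
SELECT i.ID, i.Amount
FROM @Input i
WHERE NOT EXISTS (SELECT *
                  FROM YourTable t
                  WHERE t.ID = i.ID AND t.Amount >= i.Amount);
GO

On the C# side the DataTable is attached with SqlDbType.Structured and TypeName = "dbo.InputResultSet", as in the table-valued parameter sketch in the answer above.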
Mladen Prajdic
1

Use the IN operator. Your problem is very common and it has a name: the N+1 query problem.

Where are you getting the IDs from? If it is from another query, then consider grouping them into one.

BCartolo
1

Rather than performing a separate query for every single ID you have, execute one query to get the amount for every ID you want to check (or, if you have too many IDs to put in one query, batch them into groups of a few thousand).

Servy
1

Import the data directly into SQL Server. Use a stored procedure to output the data you need.

If you must consume it in the app tier, use the xml data type to pass the IDs into a stored procedure.

sam yi
  • if you've got the data in SQL server, the answer can be got with a pretty simple query, no need to write a stored procedure – Jonny Cundall Sep 03 '14 at 20:54
1

You can import the data from the Excel file into SQL Server as a table (using the import data wizard). Then you can perform a single query in SQL Server where you join this table to your lookup table on the ID field. There are a few more steps to this process, but it's a lot neater than trying to concatenate all the IDs into a much longer query.

I'm assuming a certain amount of access privileges to the server here, but this is what I'd do given the access I normally have. I'm also assuming this is a one-off task; if not, the import of the data into SQL Server can be done programmatically as well.
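
If it does need automating, one hedged sketch uses SqlBulkCopy to load a staging table (the staging table name is illustrative, and the Excel rows are assumed to already be in a DataTable):

// bulk-load the Excel rows into a pre-created staging table,
// then join to it server-side
using (SqlBulkCopy bulk = new SqlBulkCopy(sqlCon))
{
    bulk.DestinationTableName = "dbo.ExcelStaging";
    bulk.WriteToServer(excelDataTable); // rows read from the Excel file
}

// then, on the server:
//   SELECT t.Amount, t.Client, t.Pallet
//   FROM [table] t
//   JOIN dbo.ExcelStaging s ON s.ID = t.ID
//   WHERE t.Amount > 0;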

Jonny Cundall
1

The IN clause has limits, so if you go with that approach, make sure you use a batch size and process only so many IDs at a time; otherwise you will hit another issue.

As @Robertharvey has noted, if there are not a lot of IDs and no transactions are occurring, then just pull all of them at once into memory, into a dictionary-like object, and process them there. Six thousand values is not a lot, and a single select could return all of them within a few seconds.

Just remember that if another process is updating the data, your local cached version may be stale.
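
A sketch of that in-memory approach (column types are assumed from the question; excelId and excelAmount stand in for whatever you read from the spreadsheet):

// load every relevant row once, then check each Excel row locally
var amounts = new Dictionary<string, decimal>();
using (SqlCommand cmd = new SqlCommand(
    "select ID, Amount from table where Amount > 0;", sqlCon))
using (SqlDataReader reader = cmd.ExecuteReader())
{
    while (reader.Read())
        amounts[reader.GetString(0)] = reader.GetDecimal(1);
}

// for each Excel row:
decimal dbAmount;
bool ok = amounts.TryGetValue(excelId, out dbAmount) && dbAmount >= excelAmount;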

Jon Raynor
  • 6 thousand is the number they want to match. They say in the comments there could be millions in the DB. – Martin Smith Sep 03 '14 at 21:16
If that's the case, import the data into a staging table and then select it all via a join. After processing, truncate the staging table. That would be the one query to return the 6,000 or so rows. – Jon Raynor Sep 03 '14 at 21:23
1

There is another way to handle this: make an XML document of the IDs and pass it to a stored procedure. Here is the code for the procedure.

IF OBJECT_ID('GetDataFromDatabase') IS NOT NULL
BEGIN
    DROP PROCEDURE GetDataFromDatabase
END
GO

--Definition
CREATE PROCEDURE GetDataFromDatabase
    @xmlData XML
AS
BEGIN
    DECLARE @DocHandle INT
    DECLARE @idList TABLE (id INT)

    EXEC sp_xml_preparedocument @DocHandle OUTPUT, @xmlData;

    -- shred the <data><id>...</id></data> elements into a table variable
    INSERT INTO @idList (id)
    SELECT x.id
    FROM OPENXML(@DocHandle, '//data', 2) WITH ([id] INT) x

    EXEC sp_xml_removedocument @DocHandle;
    --SELECT * FROM @idList

    SELECT t.Amount, t.Client, t.Pallet
    FROM yourTable t
    INNER JOIN @idList x ON t.id = x.id AND t.Amount > 0;
END
GO

--Usage
EXEC GetDataFromDatabase @xmlData = '<root><data><id>1</id></data><data><id>2</id></data></root>'

You can put any logic you need in the procedure, and you can pass the amounts along with the IDs in the same XML. A huge list of IDs can be passed this way.
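
For completeness, a sketch of the calling side in C# (it assumes the IDs are numeric and therefore need no XML escaping):

// build the <root><data><id>...</id></data>...</root> payload
// expected by the procedure above, then execute it
var xml = new System.Text.StringBuilder("<root>");
foreach (string id in ID)
    xml.Append("<data><id>").Append(id).Append("</id></data>");
xml.Append("</root>");

SqlCommand cmd = new SqlCommand("GetDataFromDatabase", sqlCon);
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add("@xmlData", SqlDbType.Xml).Value = xml.ToString();

using (SqlDataAdapter da = new SqlDataAdapter(cmd))
    da.Fill(dataTable);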

Dinesh Maind
0

You can select the whole result set (or join multiple 'limited' result sets) and save it all to a DataTable. Then you can do selects and updates (if needed) directly on the DataTable, and plug the new data back in afterwards. It's not super efficient memory-wise, but it is often a very good (and sometimes the only) solution when working in bulk and needing speed. With thousands of records, it might take a couple of minutes to populate all of them into the DataTable.

Then you can search your table like this:

string findMatch = "ID = 'someId'"; // a filter expression; quote string values
DataRow[] rowsFound = dataTable.Select(findMatch);

Then just loop over the matches: foreach (DataRow dr in rowsFound) { ... }

Andrew
  • This may not be an option if the total amount of data wouldn't fit into memory, and would be excessive and inefficient if only a small percentage of the items in the list are actually checked. – Servy Sep 03 '14 at 20:42
Well, this could be the only option when you have to work with millions of records and you need performance. Also, there is a swap file, and you need to tune your servers to match your tasks. – Andrew Sep 03 '14 at 20:43
  • It's certainly not the only option. It's not even the best option, but it is dramatically better than the OP's approach if there is a large number of IDs. – Servy Sep 03 '14 at 20:45
0

SqlDataAdapter objects are too heavy for this. First, using stored procedures will be faster. Second, use a set-based operation: pass the list of identifiers to the database as a parameter, run a single query against them, and return the processed result. It will run quickly and efficiently, because all the data-processing logic stays on the database server's side.

Anton