1

I have an example query with a Where Contains clause: db.jobs.Where(Function(j) ids.Contains(j.id))

My issue is that the real world array is over 100 in length, the generated SQL takes each element as it's own parameter, so the WHERE clause is something like IN (@p1....@p100).

Sending over 100 parameters to the database is I guess slow, what would be a better approach of doing this query?
What is the maximum number of parameters that can be used before performance is reduced?
Is there a way to tell LINQ to send the array as 1 parameter rather than 100 parameters?

Here is my full modified query:

Dim totMins = db.jobs _
   .Where(Function(j) ids.Contains(j.id)) _
   .GroupBy(Function(wo) wo.id) _
   .ToDictionary(Function(g) g.Key, Function(g) g.Sum(Function(wo) wo.DurationMin))

Before I had a forEach loop and did over 100 database calls to get each total, so I thought this Dictionary approach would be better.

Edit

As mentioned in my comments, I was having an issue with the SQL excecuting multiple times (seemingly indefinatly)
I solved the issue by projecting into an anonymous object and calling ToArray to execute the Linq instead.

Dim totMins = db.Jobs _
    .Where(Function(j) ids.Contains(j.id)) _
    .GroupBy(Function(j) j.id) _
    .Select(Function(g) New With {g.Key, .Sum = g.Sum(Function(j) j.DurationMin)}) _
    .ToArray() _
    .ToDictionary(Function(g) g.Key, Function(g) g.Sum)

I can only assume that ToArray and ToDictonary handle the deferred execution of Linq-to-SQL differently, and for some reason ToDictonary was performing multiple select queries

Luke T O'Brien
  • 2,565
  • 3
  • 26
  • 38
  • 1
    How did you come to conclusion that sending 100 parameters is slow? – Evaldas Buinauskas Jun 27 '17 at 11:22
  • If the execution is slow the only way to figure out why is by analyzing the execution plan for the actual query. Use Sql Profiler and capture the query as it is executed. Copy that query with parameter values to SSMS and get the Actual Execution Plan and analyze that. – Igor Jun 27 '17 at 11:23
  • If the ids are in a range you can use Between in the SQL query. Sending the array to a stored procedure is another solution. – jdweng Jun 27 '17 at 11:25
  • How many jobs are there (roughly)? I've once had a similar thing to do and it turned out that it was much faster to first get *all* jobs into a cache object and then do the filtering and sorting locally. This depends on the amount of data, of course. Another approach would be a parametrized stored procedure in your DB, you can easily use them with LINQ. – Rob Jun 27 '17 at 11:26
  • 1
    `Contains` isn't slow, as long as the `ID` column is indexed. Without an index *all* data access will be slow – Panagiotis Kanavos Jun 27 '17 at 11:29
  • BTW nothing is slower than trying to load items one by one in memory. It will be 100 times slower *at least*. That's because the *round-trip* takes more than the query itself. It will be a lot worse without indexes, as each query will have to scan the entire table. – Panagiotis Kanavos Jun 27 '17 at 11:31
  • @PanagiotisKanavos not sure about current version of EF, but `Contains` is known to be slow because of Entity Framework sql generation, not even because of `IN` query. As I remember, Contains against about 10K elements might took 10-15 seconds, of which only several milliseconds was actually executing query by database. – Evk Jun 27 '17 at 11:38
  • @Evk this [has changed](https://stackoverflow.com/a/7936350/134204) in EF6 - EF didn't use an `IN` statement until EF6. That was 4 years ago. The *current* version is EF Core with doesn't need .NET Core and includes other interesting tricks [like this one](https://www.brentozar.com/archive/2017/05/case-entity-framework-cores-odd-sql/) to batch inserts. This time it seems the trick works, and insert speed improves by 250% – Panagiotis Kanavos Jun 27 '17 at 11:53
  • @EvaldasBuinauskas I just thought it was. I have read https://visualstudiomagazine.com/articles/2010/06/24/five-tips-linq-to-sql.aspx?admgarea=features, Tip #4 Keep the Number of Parameters Down - I'm also having an issue with the query seems to be stuck in a loop and executing over and over, but I guess that is something else.
    Also just to say that we are using Linq to SQL not EF, don't know if there is a difference with the provider
    – Luke T O'Brien Jun 27 '17 at 12:07
  • *I guess* -- that's a feeble basis for a Stack Overflow question. Anyway, LINQ-to-SQL has a built-in limit of 2100 parameters. – Gert Arnold Jun 27 '17 at 12:18
  • Why not just send `Xml` into the database, it only requires one string to be sent in... Then you can pull out what you need and only make one call... – Trevor Jun 27 '17 at 13:14
  • @Codexer I did think about concatenating a string, but then how do I get Linq to generate the SQL? – Luke T O'Brien Jun 27 '17 at 14:00
  • Why do you need Linq to generate it, let the DBMS generate what you would need and spit it out. Then theres no need to do any Linq... – Trevor Jun 27 '17 at 15:23

2 Answers2

2

Sending over 100 parameters to the database is I guess slow

Sending 100 parameters to database is not slower than any other approach that you may take, because you would need to transfer the data for these parameters one way or the other.

I would not worry about this problem until the number of elements starts climbing into a few thousands. After that I would bypass LINQ and use table-valued parameters for the query.

Before I had a forEach loop and did over 100 database calls to get each total, so I thought this Dictionary approach would be better.

A roundtrip to RDBMS costs a lot more than sending an extra parameter - probably, several orders of magnitude. With very high level of certainty, any solution that reduces the number of roundtrips would be an improvement.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
0

How stable is the data in your ids array?

If it's constant (or at least, stays consistent over several queries), then you may want to consider putting it into a table and doing a join:

  from j in db.jobs
  join i in db.ids on j.id equals i.id
  select j;
James Curran
  • 101,701
  • 37
  • 181
  • 258