2

I'm using Entity Framework 6.0 with SQL Server localdb, I tried using the Enumerable.Contains feature, but I found it very slow when we have a big list.

I researched and found that there is a recommendation to use sql statements for this case, as described in:Why is .Contains slow? Most efficient way to get multiple entities by primary key?.

Now I wonder why the following statement is so slow, taking more than 180 seconds to run the "ToList ()"?

using (var context = new MyDbContext())
{
    Random rand = new Random();
    var ids = new List<int>();
    for (int i = 0; i < 20000; i++)
        ids.Add(rand.Next(550000));

    Stopwatch watch = new Stopwatch();
    watch.Start();

    var values = new StringBuilder();
    values.AppendFormat("{0}", ids[0]);
    for (int i = 1; i < ids.Count; i++)
        values.AppendFormat(", {0}", ids[i]);

    var sql = string.Format(
        "SELECT * FROM [MyDb].[dbo].[MyEntities] WHERE NOT [ID] IN ({0})",
        values);

    var result = context.Set<MyEntity>().SqlQuery(sql).ToList();

    watch.Stop();
    var msec = watch.ElapsedMilliseconds;
}

Note that the statement is similar to the link test3 above, except that we included the term "NOT".

Sorry my bad English^^.

Community
  • 1
  • 1
Zeus-Adenilton
  • 723
  • 5
  • 12

2 Answers2

1

Sending a very large filter condition to the database like you are doing can be very slow. Depending on the size of the table it can be much faster to retrieve all the values and do the filtering in memory instead using a HashSet. Try this:

Random rand = new Random();
var set= new HashSet<int>();
for (int i = 0; i < 20000; i++)
    set.Add(rand.Next(550000));

Stopwatch watch = new Stopwatch();
watch.Start();

var sql = "SELECT * FROM [MyDb].[dbo].[MyEntities]";

var result = context.Set<MyEntity>()
             .SqlQuery(sql)
             .AsEnumerable()
             .Where(x => !set.Contains(x.ID))
             .ToList();

watch.Stop();
Magnus
  • 45,362
  • 8
  • 80
  • 118
  • If i execute the statement directly in sql server, in the first time ir execute in 11 seconds, but i will test your suggestion. – Zeus-Adenilton Jan 23 '15 at 13:45
  • Also if you table contains many columns only retrive the ones you actually need, it can save a lot of time using this solution. – Magnus Jan 23 '15 at 13:47
  • The table has 19 columns. Please know tell me why this absurd difference runtime? If I run directly in sql server: 11 seconds. The entity framework using sql statement: 226 seconds. – Zeus-Adenilton Jan 23 '15 at 13:50
  • @Zeus-Adenilton Let me know which timings you got with this solution. – Magnus Jan 23 '15 at 13:54
  • Your example runs in 5 seconds, due to the number of records in this table is only 10399. Do you have any suggestions for when the number of rows is much higher? – Zeus-Adenilton Jan 23 '15 at 14:29
  • @Zeus-Adenilton It largely depends on the the size of each row. If small this solution should work fine on very large tables. So only retrieve the columns you actually need. – Magnus Jan 23 '15 at 14:43
  • I will use this alternative, but it is not what I expect from the Entity Framework. This is basic select any DBMS runs in seconds, for this scenario. Will bug? – Zeus-Adenilton Jan 23 '15 at 16:21
0

When using EF to run your queries the execution time combines of several things:

  • time to build and translate the query
  • time to run the query and receive results
  • time to materialize query results into entities

By using SQL directly with SqlQuery method you are omitting the time to build the query and translate it into SQL.

In your case this time will be relatively small compared to query execution and result materialization (which can take quite a lot of time if there are too many results to materialize).

You should check how much time does it take to run the query directly and how many results does it return (if there are too many, like thousands of results, then you should try to minimize the query result or use paging to get the data in relatively small chunks).

Vsevolod Goloviznin
  • 12,074
  • 1
  • 49
  • 50