8
bool isEmployeeFound;
DbContext DatabaseContext = new DbContext(DatabaseConnectionString);
using (DatabaseContext )
{
    isEmployeeFound= DatabaseContext .Persons.Any(p => p.ExternalId == "123"); -- 1st statement
    isEmployeeFound= DatabaseContext .Persons.Count(p => p.ExternalId == "123") > 0; --2nd statement
}

My requirement is to check only If a given employee Id; exist in a table or not. I do not want the rows from table just a true or false. I am using Entity Framework not LINQ to Objects.

I have been reading about Any and Count and kind of not able to decide; which one should I use? As shown above in my code should I use 1st or 2nd statement?

I read that using Any(It converts to Exists in SQL) is faster because as soon it satisfied the condition it stops iterating and returns the result whereas Count()(It converts to select Count(*) in SQL) iterates all and then return the result. However this post Which method performs better: .Any() vs .Count() > 0? says Count() is optimized for Linq to objects and it performs better than Any.

I did explore some more and tried getting times using code snippet below

using (var _dbContext = new DbContext())
{
    string caregiverId = "2301001";
    string clientPhoneNumber = "9795397674";

    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();

    bool iscareGiverFoundWithAny = _dbContext.Persons.Any(p => p.ExternalId == caregiverId);
    bool isClientPhoneNumberFoundWithAny = _dbContext.PhoneNumbers.Any(ph => ph.Number == clientPhoneNumber);

    var testResult1 = stopwatch.Elapsed;
    stopwatch.Restart();

    bool iscareGiverFoundWithCountExt = _dbContext.Persons.Count(p => p.ExternalId == caregiverId) > 0;
    bool isClientPhoneNumberFoundWithCountExt = _dbContext.PhoneNumbers.Count(ph => ph.Number == clientPhoneNumber) > 0;

    var testResult2 = stopwatch.Elapsed;
    stopwatch.Stop();

    Console.WriteLine("Any " + testResult2.TotalSeconds); -- value for this line is coming as 0.0276239
    Console.WriteLine("Count Ext. " + testResult3.TotalSeconds); -- value for this line is coming as 0.0054292
    Console.ReadLine();
}

On running above code it shows Count is faster. I am confused.

Thoughts please?

Community
  • 1
  • 1
user1955255
  • 219
  • 4
  • 18
  • 2
    Why don't you use the SQL Profiler or Express Profiler to check what SQL clause is producing depending on how you build your LINQ queries? – Matías Fidemraizer Jun 03 '15 at 12:02
  • For Any Query below is the sql DECLARE @p__linq__0 NVarChar(1000) = '2301001' -- EndRegion SELECT CASE WHEN ( EXISTS (SELECT 1 AS [C1] FROM [dbo].[Persons] AS [Extent1] WHERE ([Extent1].[ExternalId] = @p__linq__0) OR (([Extent1].[ExternalId] IS NULL) AND (@p__linq__0 IS NULL)) )) THEN cast(1 as bit) ELSE cast(0 as bit) END AS [C1] FROM ( SELECT 1 AS X ) AS [SingleRowTable1] – user1955255 Jun 03 '15 at 12:06
  • For Count query below is the sql -- Region Parameters DECLARE @p__linq__0 NVarChar(1000) = '2301001' -- EndRegion SELECT [GroupBy1].[A1] AS [C1] FROM ( SELECT COUNT(1) AS [A1] FROM [dbo].[Persons] AS [Extent1] WHERE ([Extent1].[ExternalId] = @p__linq__0) OR (([Extent1].[ExternalId] IS NULL) AND (@p__linq__0 IS NULL)) ) AS [GroupBy1] – user1955255 Jun 03 '15 at 12:08
  • I told you that you can check the resulting queries to get the answer youself :D – Matías Fidemraizer Jun 03 '15 at 12:09
  • :-) I can see Any uses exists and hence it should be faster but I am confused when other post(http://stackoverflow.com/questions/305092/which-method-performs-better-any-vs-count-0) says Count() is optimized to use .Count in case if I use count() extension method. – user1955255 Jun 03 '15 at 12:43
  • 1
    Reading the query Any should be faster so is very strange what you're saying. You could also paste the query and check the execution plan... EXISTS should have the same execution plan but should stop at first found record while COUNT should count every record (also if there is an index they could be on different leaves). I think that if you forget the index (on ExternalId) or there is "a lot of" records EXISTS is faster. Consider also that you're running Any before Count so SQL could have everything in memory when you run count... – bubi Jun 03 '15 at 16:15

2 Answers2

8

The short answer: stick with Any().

You can safely ignore the post you are referencing, because it isn't specific to Entity Framework. When using LINQ on top of simple collections, then yes, there may be other considerations. But when using LINQ to query a database, then you want to avoid iterating through extra database records unnecessarily.

In this case, you say that Count() proved to be faster than Any(). But the difference you experienced is so minimal, at least in database performance terms, that you could say that you got equivalent performance in this case.

And in effect, if your table is small, or if the column you are searching on is properly indexed and returns very few records, then you can expect the performance to be fairly similar with both Any() and Count().

But let's say you have a big table, and your ExternalId column is not indexed, then you would inevitably notice Any() greatly outperform Count().

The point: Best case scenario (if your data model is optimized), both options may perform similarly. Worst case, Any() will absolutely outperform Count().

There is never a scenario where Count() will vastly outperform Any(), unless EF introduces a serious SQL generation bug. So to be safe, I recommend you stick with Any().

sstan
  • 35,425
  • 6
  • 48
  • 66
0

In this case

.Any(p => p.ExternalId == "123");
.Count(p => p.ExternalId == "123") > 0;

Any would still perform better.

Personally I would recomend you try efprof from hibernatingrhinos to get the executed sql and timings.

Margus
  • 19,694
  • 14
  • 55
  • 103
  • If you happen to use **.Count(p => p.ExternalId == "123") > 0;** and you have Reshaper installed, it actually recommends you change to **any**. – Margus Jun 03 '15 at 13:07