A question posted earlier got me thinking. Would Any() and Count() perform similarly when used on an empty list?

As explained here, both should go through the same steps of GetEnumerator()/MoveNext()/Dispose().

I tested this with a quick program in LINQPad:

static void Main()
{
    var list = new List<int>();

    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();

    for (int i = 0; i < 10000; i++)
        list.Any();

    stopwatch.Stop();
    Console.WriteLine("Time elapsed for Any()   : {0}", stopwatch.Elapsed);


    stopwatch = new Stopwatch();
    stopwatch.Start();

    for (int i = 0; i < 10000; i++)
        list.Count();

    stopwatch.Stop();
    Console.WriteLine("Time elapsed for Count(): {0}", stopwatch.Elapsed);
}

And the general result seems to indicate that Count() is faster in this situation. Why is that?

I'm not sure I got the benchmark right, so I would appreciate any corrections.


Edit: I understand that Any() would make more sense semantically. The first link I posted in the question shows a situation where it does make sense to use Count() directly, since the value would be used afterwards, hence the question.

Vimal Stan
  • They would both be very fast but, if testing a `List` just use the `Count` property, rather than the `Count()` extension, that requires no enumeration. – Jodrell Apr 24 '13 at 11:11
  • What exactly did your benchmarks show? I'd expect that calling either of these just 10000 times would be so fast as to not be sensibly measurable. – Jon Skeet Apr 24 '13 at 11:11
  • The reason `Any` is generally better is that it only needs to find one thing in the enumeration, whereas `Count` needs to find all of them. In your test the list is empty, so obviously finding the first and finding all doesn't make much difference. – George Duckett Apr 24 '13 at 11:11
  • `note that the LINQ-to-Objects implementation of Count() does check for ICollection (using .Count as an optimisation) - so if your underlying data-source is directly a list/collection, there won't be a huge difference` From http://stackoverflow.com/questions/305092/which-method-performs-better-any-vs-count-0 – Habib Apr 24 '13 at 11:11
  • Relevant: A post I wrote a long time ago about understanding the differences of `Any` vs `Count`, when to use them (and more importantly, when *not* to use them): http://www.caspershouse.com/post/Anything-Counts.aspx – casperOne Apr 24 '13 at 11:12
  • Actually you are already linking to the question which answers this. As also mentioned there, IMO it is much better to prefer whatever is semantically more accurate (unless you are on a CPU-bound critical path, but that's so rare). – Jon Apr 24 '13 at 11:13
  • @JonSkeet The values displayed were very small, but there is a consistent difference. Here's one output `Time elapsed for Any() : 00:00:00.0004658` & `Time elapsed for Count(): 00:00:00.0001871` – Vimal Stan Apr 24 '13 at 11:13
  • `Count()` on a `ICollection` will just use the `Count` property. – Tim Schmelter Apr 24 '13 at 11:14
  • I added a warm up and changed it to 100000000 iterations. `Time elapsed for Any() : 00:00:03.7086543`, `Time elapsed for Count(): 00:00:01.6072861 ` – Blorgbeard Apr 24 '13 at 11:15
  • Does the LINQ method Count() on a list just use the Count property? In which case I wouldn't be surprised if it is quicker given that it is just referencing a property rather than interacting with the enumerator – Jack Hughes Apr 24 '13 at 11:15
  • While theoretically interesting, I'd suggest that this question has little value anyway: if you already know the list is empty, you don't need to call either method; if you don't know, you should prefer `Any` over `Count`. – Dan Puzey Apr 24 '13 at 11:17
  • Although it's an interesting question, performance is not a key concern in this case, while showing business intent is. Therefore, Any() is the recommended approach. – L-Four Apr 24 '13 at 11:46
  • @casperOne Your answer was valid? Why the deletion? – Mathew Thompson Apr 24 '13 at 12:03
  • @mattytommo I didn't feel that it directly answered the question, it answered the general question that was referenced in the question this was incorrectly closed as a duplicate of. – casperOne Apr 24 '13 at 12:21
  • I think you need to compare `Any()` with `Count() == 0` instead of just `Count()`... but would that make a difference for the performance results? – Philipp M Apr 24 '13 at 12:39
  • The question was why Count() was faster than Any() when it's normally the other way around. – Vimal Stan Apr 24 '13 at 12:49

1 Answer

The Count() method is optimized for the ICollection<T> type, so for a List<T> the GetEnumerator()/MoveNext()/Dispose() pattern is not used at all.

list.Count();

is effectively translated to

((ICollection<int>)list).Count;

Any(), on the other hand, has to build an enumerator, so Count() is faster here.
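To make the two code paths concrete, here is a simplified sketch of what the two extension methods roughly do. CountSketch and AnySketch are hypothetical names for illustration, not the framework's actual source, and the real implementations may differ between .NET versions.

```csharp
using System;
using System.Collections.Generic;

static class LinqSketch
{
    // Simplified stand-in for Enumerable.Count() (assumption: not the real source).
    // It tries the ICollection<T> shortcut before falling back to enumeration.
    public static int CountSketch<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");

        var collection = source as ICollection<T>;
        if (collection != null) return collection.Count; // no enumerator is created

        int count = 0;
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            while (e.MoveNext()) count++;
        }
        return count;
    }

    // Simplified stand-in for Enumerable.Any(): always creates an enumerator
    // and asks it for at most one element.
    public static bool AnySketch<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");

        using (IEnumerator<T> e = source.GetEnumerator())
        {
            return e.MoveNext();
        }
    }

    static void Main()
    {
        var list = new List<int>();
        Console.WriteLine(CountSketch(list)); // prints 0
        Console.WriteLine(AnySketch(list));   // prints False
    }
}
```

For an empty List<int>, the count path is a type check plus a property read, while the any path allocates and disposes an enumerator, which is consistent with the timings reported in the question.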

Here are benchmarks for 4 different IEnumerable instances. MyEmpty is defined as IEnumerable<T> MyEmpty<T>() { yield break; }

Iterations: 100000000 (times in seconds)

Function                      Any()     Count()
new List<int>()               4.310     2.252
Enumerable.Empty<int>()       3.623     6.975
new int[0]                    3.960     7.036
MyEmpty<int>()                5.631     7.194

As casperOne points out in the comments, Enumerable.Empty<int>() returns an empty array, which implements ICollection<int>. Arrays do poorly with the Count() extension because the cast to ICollection<int> is not trivial for them.

For a hand-written empty IEnumerable (MyEmpty), we see what we expected: Count() is slower than Any(), because of the overhead of first testing whether the IEnumerable is an ICollection.
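A quick way to see which sources can take the ICollection<T> shortcut is to test the type directly. The sketch below is only illustrative: IsCollection is a hypothetical helper, and MyEmpty is the same iterator method used in the benchmark.

```csharp
using System;
using System.Collections.Generic;

static class CollectionCheck
{
    // Same shape as the MyEmpty method used in the benchmark.
    public static IEnumerable<T> MyEmpty<T>() { yield break; }

    // Hypothetical helper: reports whether a source implements ICollection<T>.
    public static bool IsCollection<T>(IEnumerable<T> source)
    {
        return source is ICollection<T>;
    }

    static void Main()
    {
        Console.WriteLine(IsCollection(new List<int>())); // prints True
        Console.WriteLine(IsCollection(new int[0]));      // prints True
        Console.WriteLine(IsCollection(MyEmpty<int>()));  // prints False
    }
}
```

Only the compiler-generated iterator fails the check, so it is the only source here where Count() has to fall back to enumerating.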

Complete benchmark:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{
    public const long Iterations = (long)1e8;

    static void Main()
    {
        var results = new Dictionary<string, Tuple<TimeSpan, TimeSpan>>();
        results.Add("new List<int>()", Benchmark(new List<int>(), Iterations));
        results.Add("Enumerable.Empty<int>()", Benchmark(Enumerable.Empty<int>(), Iterations));
        results.Add("new int[0]", Benchmark(new int[0], Iterations));
        results.Add("MyEmpty<int>()", Benchmark(MyEmpty<int>(), Iterations));

        Console.WriteLine("Function".PadRight(30) + "Any()".PadRight(10) + "Count()");
        foreach (var result in results)
        {
            Console.WriteLine("{0}{1}{2}", result.Key.PadRight(30), Math.Round(result.Value.Item1.TotalSeconds, 3).ToString().PadRight(10), Math.Round(result.Value.Item2.TotalSeconds, 3));
        }
        Console.ReadLine();
    }

    public static Tuple<TimeSpan, TimeSpan> Benchmark(IEnumerable<int> source, long iterations)
    {
        var anyWatch = new Stopwatch();
        anyWatch.Start();
        for (long i = 0; i < iterations; i++) source.Any();
        anyWatch.Stop();

        var countWatch = new Stopwatch();
        countWatch.Start();
        for (long i = 0; i < iterations; i++) source.Count();
        countWatch.Stop();

        return new Tuple<TimeSpan, TimeSpan>(anyWatch.Elapsed, countWatch.Elapsed);
    }

    public static IEnumerable<T> MyEmpty<T>() { yield break; }
}
Cyril Gandon
  • Using `Enumerable.Empty()` shows that `Any()` is faster than `Count()`. Thanks! – Vimal Stan Apr 24 '13 at 12:07
  • -1: Your test is incorrect. `Enumerable.Empty()` returns an empty array, and arrays implement `IList` which extends `ICollection`. You need a method that does nothing but `yield break`. The call essentially sniffs out the same code path in your test. – casperOne Apr 24 '13 at 15:45