43

Anyone know any speed differences between Where and FindAll on List. I know Where is part of IEnumerable and FindAll is part of List, I'm just curious what's faster.

Dested
  • 6,294
  • 12
  • 51
  • 73
  • 2
    possible duplicate of [FindAll vs Where extension-method](http://stackoverflow.com/questions/1531702/findall-vs-where-extension-method) – Ryan Gates Dec 04 '14 at 16:43

5 Answers5

63

The FindAll method of the List<T> class actually constructs a new list object, and adds results to it. The Where extension method for IEnumerable<T> will simply iterate over an existing list and yield an enumeration of the matching results without creating or adding anything (other than the enumerator itself.)

Given a small set, the two would likely perform comparably. However, given a larger set, Where should outperform FindAll, as the new List created to contain the results will have to dynamically grow to contain additional results. Memory usage of FindAll will also start to grow exponentially as the number of matching results increases, where as Where should have constant minimal memory usage (in and of itself...excluding whatever you do with the results.)

jrista
  • 32,447
  • 15
  • 90
  • 130
  • 30
    The exception is where you actually do want to have a list afterwards (maybe you need to call `Count` or change members, or iterate through it more than once). While `Where()` beats `FindAll()`, `FindAll()` beats `Where().ToList()`. – Jon Hanna Sep 30 '10 at 09:41
  • 7
    @JonHanna: While initially I thought I agreed, I'm actually not sure. Do you have any references that indicate a .ToList() is slower than a .FindAll()? Calling .ToList() on a query would **be** the iteration of the enumerable, and therefor should maintain its memory efficiency. Not only that, certain internal implementations of the where iterator might even be able to create a list of exactly the right size (memory allocation) up front, outperforming the FindAll in such cases. I'm not specifically disagreeing, however it would be nice to have a solid reference that clarifies FindAlls benefit. – jrista Dec 21 '11 at 22:23
  • 1
    This answer is dead wrong. See @Wiory who bothered to actually measuring. – Carlo V. Dango Sep 27 '13 at 08:34
  • 2
    @Carlo: Sorry, but it is actually you who is wrong. Your comment on Wiory's answer seems to fail to notice the fact that he DID enumerate every approach via the "Check()" method...including the where->ienum approach. Wiory's results validate my answer...FindAll is SLOWER than using Where. Additionally, the various implementations of Where for different types of underlying collections are often optimized for the specific mechanism of the collection, offering even further performance boost (i.e. it is not all purely generic "where" behavior...it can be quite efficient!) – jrista Oct 02 '13 at 17:58
12

FindAll is obviously slower than Where, because it needs to create a new list.

Anyway, I think you really should consider Jon Hanna comment - you'll probably need to do some operations on your results and list would be more useful than IEnumerable in many cases.

I wrote small test, just paste it in Console App project. It measures time/ticks of: function execution, operations on results collection(to get perf. of 'real' usage, and to be sure that compiler won't optimize unused data etc. - I'm new to C# and don't know how it works yet,sorry).

Notice: every measured function except WhereIENumerable() creates new List of elements. I might be doing something wrong, but clearly iterating IEnumerable takes much more time than iterating list.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;

namespace Tests
{

    public class Dummy
    {
        public int Val;
        public Dummy(int val)
        {
            Val = val;
        }
    }
    public class WhereOrFindAll
    {
        const int ElCount = 20000000;
        const int FilterVal =1000;
        const int MaxVal = 2000;
        const bool CheckSum = true; // Checks sum of elements in list of resutls
        static List<Dummy> list = new List<Dummy>();
        public delegate void FuncToTest();

        public static long TestTicks(FuncToTest function, string msg)
        {
            Stopwatch watch = new Stopwatch();
            watch.Start();
            function();
            watch.Stop();
            Console.Write("\r\n"+msg + "\t ticks: " + (watch.ElapsedTicks));
            return watch.ElapsedTicks;
        }
        static void Check(List<Dummy> list)
        {
            if (!CheckSum) return;
            Stopwatch watch = new Stopwatch();
            watch.Start();

            long res=0;
            int count = list.Count;
            for (int i = 0; i < count; i++)     res += list[i].Val;
            for (int i = 0; i < count; i++)     res -= (long)(list[i].Val * 0.3);

            watch.Stop();
            Console.Write("\r\n\nCheck sum: " + res.ToString() + "\t iteration ticks: " + watch.ElapsedTicks);
        }
        static void Check(IEnumerable<Dummy> ieNumerable)
        {
            if (!CheckSum) return;
            Stopwatch watch = new Stopwatch();
            watch.Start();

            IEnumerator<Dummy> ieNumerator = ieNumerable.GetEnumerator();
            long res = 0;
            while (ieNumerator.MoveNext())  res += ieNumerator.Current.Val;
            ieNumerator=ieNumerable.GetEnumerator();
            while (ieNumerator.MoveNext())  res -= (long)(ieNumerator.Current.Val * 0.3);

            watch.Stop();
            Console.Write("\r\n\nCheck sum: " + res.ToString() + "\t iteration ticks :" + watch.ElapsedTicks);
        }
        static void Generate()
        {
            if (list.Count > 0)
                return;
            var rand = new Random();
            for (int i = 0; i < ElCount; i++)
                list.Add(new Dummy(rand.Next(MaxVal)));

        }
        static void For()
        {
            List<Dummy> resList = new List<Dummy>();
            int count = list.Count;
            for (int i = 0; i < count; i++)
            {
                if (list[i].Val < FilterVal)
                    resList.Add(list[i]);
            }      
            Check(resList);
        }
        static void Foreach()
        {
            List<Dummy> resList = new List<Dummy>();
            int count = list.Count;
            foreach (Dummy dummy in list)
            {
                if (dummy.Val < FilterVal)
                    resList.Add(dummy);
            }
            Check(resList);
        }
        static void WhereToList()
        {
            List<Dummy> resList = list.Where(x => x.Val < FilterVal).ToList<Dummy>();
            Check(resList);
        }
        static void WhereIEnumerable()
        {
            Stopwatch watch = new Stopwatch();
            IEnumerable<Dummy> iEnumerable = list.Where(x => x.Val < FilterVal);
            Check(iEnumerable);
        }
        static void FindAll()
        {
            List<Dummy> resList = list.FindAll(x => x.Val < FilterVal);
            Check(resList);
        }
        public static void Run()
        {
            Generate();
            long[] ticks = { 0, 0, 0, 0, 0 };
            for (int i = 0; i < 10; i++)
            {
                ticks[0] += TestTicks(For, "For \t\t");
                ticks[1] += TestTicks(Foreach, "Foreach \t");
                ticks[2] += TestTicks(WhereToList, "Where to list \t");
                ticks[3] += TestTicks(WhereIEnumerable, "Where Ienum \t");
                ticks[4] += TestTicks(FindAll, "FindAll \t");
                Console.Write("\r\n---------------");
            }
            for (int i = 0; i < 5; i++)
                Console.Write("\r\n"+ticks[i].ToString());
        }
    
    }

    class Program
    {
        static void Main(string[] args)
        {
            WhereOrFindAll.Run();
            Console.Read();
        }
    }
}

Results(ticks) - CheckSum enabled(some operations on results), mode: release without debugging(CTRL+F5):

 - 16,222,276 (for ->list)
 - 17,151,121 (foreach -> list)
 -  4,741,494 (where ->list)
 - 27,122,285 (where ->ienum)
 - 18,821,571 (findall ->list)

CheckSum disabled (not using returned list at all):

 - 10,885,004 (for ->list)
 - 11,221,888 (foreach ->list)
 - 18,688,433 (where ->list)
 -      1,075 (where ->ienum)
 - 13,720,243 (findall ->list)

Your results can be slightly different, to get real results you need more iterations.

EvilDr
  • 8,943
  • 14
  • 73
  • 133
Wiory
  • 201
  • 3
  • 5
  • your tests are fine. They show that the LINQ mechanism is slower than operating directly on the list. Not a surprise. Your "1075 (where ->ienum)" is wrong in that using a where without traversing the resulting elements will never actually perform a where! – Carlo V. Dango Sep 27 '13 at 08:32
  • 3
    Sorry Carlo, but he calls his "Check()" method even in the where->ienum implementatin. Check() iterates all of the collections, so his results ARE entirely valid. As a consequence, that also makes my answer correct...the answer which you called "dead wrong". – jrista Oct 02 '13 at 17:56
4

UPDATE(from comment): Looking through that code I agree, .Where should have, at worst, equal performance but almost always better.

Original answer:
.FindAll() should be faster, it takes advantage of already knowing the List's size and looping through the internal array with a simple for loop. .Where() has to fire up an enumerator (a sealed framework class called WhereIterator in this case) and do the same job in a less specific way.

Keep in mind though, that .Where() is enumerable, not actively creating a List in memory and filling it. It's more like a stream, so the memory use on something very large can have a significant difference. Also, you could start using the results in a parallel fashion much faster using there .Where() approach in 4.0.

Michael Freidgeim
  • 26,542
  • 16
  • 152
  • 170
Nick Craver
  • 623,446
  • 136
  • 1,297
  • 1,155
  • 1
    The WhereEnumerableIterator, rather than WhereIterator, is actually used unless you involve index in the where clause. WhereEnumerableIterator is considerably more efficient than WhereIterator. In the case of List, it incurs the cost of an extra method call (which should be inlined in release code), but does not need to dynamically resize an internal list as part of its processing. The efficiency of Where should outperform FindAll in all but the smallest lists (anything larger than 4 results will result in one or more resizings.) – jrista Feb 14 '10 at 05:46
  • In the case of calling Where on an Array or List, there are two additional internal iterator classes, WhereArrayIterator and WhereListIterator, which are optimized for those two cases. Generally speaking, calling Where should be more efficient than calling FindAll. – jrista Feb 14 '10 at 05:49
  • 4
    @jrista - I **completely** missed the case stack in the `.Where()` overload returning those, thanks! Looking through that code I agree, .Where should have, at worst, equal performance but almost always better. Also, SO would be useless if not for people taking the extra time to educate others, e.g. you and these comments, +1 for teaching me something. – Nick Craver Feb 14 '10 at 13:24
  • Glad I could be of service. :) – jrista Feb 14 '10 at 16:55
2

Where is much, much faster than FindAll. No matter how big the list is, Where takes exactly the same amount of time.

Of course Where just creates a query. It doesn't actually do anything, unlike FindAll which does create a list.

Nick Freeman
  • 1,411
  • 1
  • 12
  • 25
Jonathan Allen
  • 68,373
  • 70
  • 259
  • 447
  • 12
    This may be technically true, but I think it's pretty clear the OP is asking about performance in the context of actually enumerating the result, not the naked method call itself. – Dan Tao Jul 15 '10 at 12:36
-4

The answer from jrista makes senses. However, the new list adds the same objects, thus just growing with reference to existing objects, which should not be that slow. As long as 3.5 / Linq extension are possible, Where stays better anyway. FindAll makes much more sense when limited with 2.0

Eric
  • 1