52

I keep hearing that in .net 3.5 you should use IEnumerable over a List, but I can’t find any reference materials or articles that explain why it’s so much more proficient. Does anyone know of any content that explains this?

The purpose of asking this question is to get a better understanding of what IEnumerable is doing under the hood. If you can provide me with any links I will do the research and post an answer.

Yanis.F
  • 612
  • 6
  • 18
Zaffiro
  • 4,834
  • 5
  • 36
  • 47

7 Answers7

83

IEnumerable<T> is an interface that is implemented by List<T>. I suspect the reason you're hearing that IEnumerable<T> should be used is because it's a less constrictive interface requirement.

For example, consider the following method signature:

void Output(List<Foo> foos) 
{ 
    foreach(var foo in foos) { /* do something */ }
}

This method requires that it be passed a concrete implementation of a List. But it's just doing something in-order. It doesn't really need random access or any of the other things that a List<T> or even an IList<T> give it. Instead, the method should accept an IEnumerable<T>:

void Output(IEnumerable<Foo> foos) 
{ 
    foreach(var foo in foos) { /* do something */ }
}

Now we're using the most general (least specific) interface that supports the operations that we need. This is a fundamental aspect of OO-design. We've decreased coupling by requiring only what we need, and not a whole lot else besides. We've also created a more flexible method because the foos parameter might be a Queue<T>, a List<T>, anything that implements IEnumerable<T>. We aren't forcing the caller to convert their data structure to a List unnecessarily.

So it isn't that IEnumerable<T> is more efficient than list in a "performance" or "runtime" aspect. It's that IEnumerable<T> is a more efficient design construct because it's a more specific indication of what your design requires. (Though this can lead to runtime gains in specific cases.)

Greg D
  • 43,259
  • 14
  • 84
  • 117
  • 3
    This is all true, but you've missed the important point regarding the efficiency of deferred execution for methods that return `IEnumerable`. – jeremyalan Aug 28 '09 at 14:27
  • 2
    Though not all methods that return `IEnumerable` utilize deferred execution (presumably, this is why the original answer mentions that "this can lead to runtime gains in specific cases"). – Jeff Sternal Aug 28 '09 at 14:58
  • You can implement deferred execution in your own implementation of any interface, including `IList`, or in your own extension of any base class. – yfeldblum Aug 28 '09 at 15:06
  • 2
    Jeff is correct, that's why I put that there. In the context of the question, I consider any runtime gains to be icing on the cake. IE, a reward for following good design principles. :) – Greg D Aug 28 '09 at 16:09
52

Enumerables have several very nice properties that you lose when converting them to a list. Namely they:

  • Use Deferred/lazy Execution
  • Are Composable
  • Are Unbounded

First I'll look at deferred execution. Pop quiz: how many times will the following code iterate over the lines in the input file?

IEnumerable<string> ReadLines(string fileName)
{
    using (var rdr = new StreamReader(fileName) )
    {
       string line;
       while ( (line = rdr.ReadLine()) != null) yield return line;
    }
}


var SearchIDs = new int[] {1234,4321, 9802};

var lines = ReadLines("SomeFile.txt")
              .Where(l => l.Length > 10 && l.StartsWith("ID: "));
              .Select(l => int.Parse(l.Substring(4).Trim()));
              .Intersect(SearchIDs);

The answer is exactly one zero. It doesn't actually do any work until you iterate over the results. You need to add this code before it even opens the file:

foreach (string line in lines) Console.WriteLine(line);

Even after the code runs, it still only ever loops over the lines once. Compare that to how many times you need to iterate over the lines in this code:

var SearchIDs = new int[] {1234,4321, 9802};
var lines = File.ReadAllLines("SomeFile.txt"); //creates a list
lines = lines.Where(l => l.Length > 10 && l.StartsWith("ID: ")).ToList();
var ids = lines.Select(l => int.Parse(l.Substring(4).Trim())).ToList();
ids = ids.Intersect(SearchIDs).ToList();

foreach (string line in lines) Console.WriteLine(line);

Even if you ignore the File.ReadAllLines() call and use the same iterator block from the first sample, the first sample would still be faster. Of course, you could write it to be just as fast using lists, but to do requires tying the code that reads the file to the code that parses it. And so you lose another important feature: composability.

To demonstrate composability, I'll add one the final feature — unbounded series. Consider the folllowing:

IEnumerable<int> Fibonacci()
{
   int n1 = 1, n2 = 0, n;
   yield return 1;
   while (true)
   {
        n = n1 + n2;
        yield return n;
        n2 = n1;
        n1 = n;
   }
}

This looks like it would go forever, but you can use the composability properties of IEnumerable to build something that safely gives, say, the first 50 values, or every value that's less than a given number:

  foreach (int f in Fibonacci().Take(50)) { /* ... */ }
  foreach (int f in Fibonacci().TakeWhile(i => i < 1000000) { /* ... */ }

Finally, IEnumerable is just more flexible. Unless you absolutely need the ability to append to the list or access items by index, you're almost always better writing functions to accept IEnumerables as arguments instead of Lists. Why? Because you can still pass the list to the function if you want — A list is an IEnumerable. For that matter, so is an array and many other collection types are well. So by using an IEnumerable here, you take the exact same function and make it more powerful, because it can act on more different kinds of data.

Yahya Hussein
  • 8,767
  • 15
  • 58
  • 114
Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
4

IEnumerable<T> is not more efficient than a List<T> as a List<T> is an IEnumerable<T>.

The IEnumerable<T> interface is simply .NET's way of using the iterator pattern, nothing more.

This interface can be implemented on many types (List<T> included) to allow those types to to return iterators (i.e. instances of IEnumerator<T>) so that the caller can iterate a sequence of items.

Andrew Hare
  • 344,730
  • 71
  • 640
  • 635
  • (-1) A minimalist IEnumerable would be smaller and faster than IList. Not by much, but List has overhead because it is more complicated. – DevinB Aug 28 '09 at 14:18
  • I disagree. As pointed out by other responses it's possible to return an `IEnumerable` implementation without creating an instance of anything, and which supports deferred execution for expensive operations. – jeremyalan Aug 28 '09 at 14:20
  • To clarify: I disagree with the answer, not with the previous comment. – jeremyalan Aug 28 '09 at 14:21
  • 1
    @devinb? Do you think it's worth a downvote? I thought Andrew was just using the classic "is a" technique to describe `List` as an implementor of `IEnumerable`. Or am I misunderstanding your comment? – Greg D Aug 28 '09 at 14:22
  • @ph0enix: I believe that you are referring to iterator blocks which _does_ involve creating and instance of a type. That type is compiler generated. Yes, you can also use deferred execution as well but none of these things have anything to do with `IEnumerable` other than they use it as an implementation detail. – Andrew Hare Aug 28 '09 at 14:29
  • @Greg: I feel that Andrew's answer does not answer the question in a useful way. The question is about efficiency and performance considerations, and Andrew's answer, although factually correct, glosses over those parts. – DevinB Aug 28 '09 at 14:39
  • 3
    @devinb: the question doesn't say anything about performance. "proficient" could refer to performance or correctness. Performance is likely one concern, but it's not the only concern. – Joel Coehoorn Aug 28 '09 at 14:58
3

It isn't a question of efficiency (although that may be true) but of flexibility.

Your code becomes more re-usable if it can consume an IEnumerable instead of a List. AS to efficient consider this code:-

 function IEnumerable<int> GetDigits()
 {

    for(int i = 0; i < 10; i++)
       yield return i
 }

 function int Sum(List<int> numbers)
 {
    int result = 0; 
    foreach(int i in numbers)
      result += i;

    return i;
 }

Q: How do I take the set of numbers generated by GetDigits and get Sum to add them up?
A: I need to load the set of numbers from GetDigits into a List object and pass that to the Sum function. This uses memory as all the digits need to be loaded into memory first before they can be summed. However changing the signature of Sum to:-

 function int Sum(IEnumerable<int> numbers)

Means I can do this:-

 int sumOfDigits = Sum(GetDigits());

No list is loaded into memory I only need storage for the current digit and the accumulator variable in sum.

AnthonyWJones
  • 187,081
  • 35
  • 232
  • 306
1

These are two different beasts and you cannot really compare them. For instance, in var q = from x in ... the q is IEnumerable, but under the hood it executes a very expensive database call.

An IEnumerable is just an interface for an Iterator design pattern, whereas List/IList is a data container.

Anton Gogolev
  • 113,561
  • 39
  • 200
  • 288
1

One reason that it is recommended to have methods return IEnumerable<T> is that it is less specific than List<T>. This means that you can later change the internals of your method to use something that may be more efficient for the needs of it, and as long as it is an IEnumerable<T> you don't need to touch the contract of your method.

Fredrik Mörk
  • 155,851
  • 29
  • 291
  • 343
0

In .NET 3.5, using IEnumerable allows you to write methods with deferred execution, such as the following:

public class MyClass
{
   private List<int> _listOne;
   private List<int> _listTwo;
public IEnumerable<int> GetItems () { foreach (int n in _listOne) { yield return n; } foreach (int n in _listTwo) { yield return n; } } }

This allows you to combine the two lists without creating a new List<int> object.

jeremyalan
  • 4,658
  • 2
  • 29
  • 38
  • You aren't creating a new `List` but you _are_ allowing the compiler to create a state machine on your behalf to make this all work. – Andrew Hare Aug 28 '09 at 14:47