4

In the framework classes of collections I have often seen IEnumerator<T> separately implemented as an inner class and an instance of it is returned in the GetEnumerator method.

Now suppose I'm writing my own collection classes which will have an inbuilt collection like List<T> or T[] act as the holder internally, say like this:

public class SpecialCollection<T> : IEnumerable<T>
{
    List<T> list;

    public SpecialCollection<T>()
    {

    }

    public IEnumerator<T> GetEnumerator()
    {
        return list.GetEnumerator();

        //or
        return list.Where(x => some logic).GetEnumerator(); 

        //or directly rely on yield keyword
        yield return x; //etc
    }
}

Should I be writing my own enumerator class, or is it ok to return enumerator of the List<T> class? Is there any circumstance under which should I be writing my own enumerator class?


I have a related question as well. If it's not all that important or doesn't make much of a difference, why do every collection class in the BCL write their own IEnumerator?

For eg, List<T> class has something like

T[] items;
public IEnumerator<T> GetEnumerator()
{
    return new List<T>.Enumerator(items);
}
nawfal
  • 70,104
  • 56
  • 326
  • 368
  • 1
    As opposed to what? Many of those classes were written before iterators existed. – SLaks Jun 10 '14 at 13:43
  • 1
    Generics were introduced at the same time as the yield keyword, if I'm not mistaken – Dennis_E Jun 10 '14 at 13:44
  • Don't use the wasteful `array.Select(x => x).GetEnumerator();`, you can use `((IEnumerable)array).GetEnumerator()`. One-dimensional arrays implement the generic interfaces in a "magical" way that looks like explicit interface implementation from the outside. – Jeppe Stig Nielsen Jun 10 '14 at 13:47
  • Many of those enumerators are structs. Perhaps that has something to do with it. – Dennis_E Jun 10 '14 at 13:47
  • @JeppeStigNielsen I'm not sure I get your point. That was some example. – nawfal Jun 10 '14 at 13:49
  • 1
    @JeppeStigNielsen I think the `// some logic` was meant that some additional logic can be added to the enumerator. – Maarten Jun 10 '14 at 13:49
  • Isn't a duplicate, but may clear up some points: http://stackoverflow.com/questions/3737997/why-implement-ienumerablet-if-i-can-just-define-one-getenumerator – Yuval Itzchakov Jun 10 '14 at 13:50
  • I can understand if this q was a duplicate, or vague. Opinion based? What's wrong? – nawfal Jun 10 '14 at 13:50
  • See if http://ericlippert.com/2011/06/30/following-the-pattern/ helps you – Jon Skeet Jun 10 '14 at 13:50
  • 1
    @Maarten Suppose I write my own type `class MyColl : IEnumerable`. I could hold my values in an array internally, so have a field `private int[] array;`. Now, when I need to implement the generic interface, I can't just say `return array.GetEnumerator();` (compile-time error) because of the not-quite-generic nature of arrays (here `int[]`). I *could* solve that by `return array.Select(x => x).GetEnumerator();`. But that is wasteful. Instead use `return ((IEnumerable)array).GetEnumerator();` or equivalently `return array.AsEnumerable().GetEnumerator();`. See what I mean? – Jeppe Stig Nielsen Jun 10 '14 at 13:58
  • @JeppeStigNielsen good point. Just to reiterate, the examples weren't carefully crafted ones. I only wanted to convey the intent. – nawfal Jun 10 '14 at 14:11
  • @JonSkeet Thanks. But that link talks about what forced the need for duck typing in case of `foreach` and why struct is preferred over a reference type. It doesn't talk anything about re-using the existing enumerator implementations. – nawfal Jun 10 '14 at 14:46
  • @nawfal That assumes you have an existing enumerator implementation to reuse. If you do, great, if you don't, then you have to do something else. At this post an iterator block will cover most everyone, that it, everyone but those with the strictest of performance requirements. – Servy Jun 10 '14 at 14:49
  • @nawfal: It talks about why `List` has its own (struct) enumerator type. – Jon Skeet Jun 10 '14 at 14:53
  • @Servy yes, that is what my question is about. That is, if there already exists enumerator implementations to re-use, is there any point in writing new ones. I asked this because I see it all the time in framework classes. But as Eric answers, performance is the only reason. – nawfal Jun 10 '14 at 14:56
  • @JonSkeet from what I understand, it doesn't talk about *why List has its **own** (struct) enumerator type*. It talks about *why List has its own **(struct)** enumerator type.* – nawfal Jun 10 '14 at 14:57
  • @nawfal: I'm not at all sure what difference you're trying to specify there, unless it's in terms of the highlighting. Given that the enumerator needs access to the list's private members, it has to be a nested struct... how would you suggest implementing it otherwise? – Jon Skeet Jun 10 '14 at 14:58
  • @JonSkeet The article doesn't talk about *not using the `GetEnumerator` of the backing array* (which is what my question is about). It talks about why it is required to be struct in the first place. My q is, *if you require a struct, then why not re-use the already implemented struct of backing array*. I hope its clear. Not to be nitpicking, but honestly the link doesnt clear my confusion at all. Eric's answer does. – nawfal Jun 10 '14 at 15:02
  • @nawfal It seems you're missing the fact that a `foreach` over an array is recognized by the compiler and transformed into a `for` loop instead, as that's much faster. This means that only boxed array enumerators are actually used, and the act of boxing also eliminates the possibility of the list-style optimizations as well, so there's no need for arrays to use the list's trick of having a struct as an enumerator (so they don't; their enumerator is a `class`), because they have something *even better* that's specific to them. – Servy Jun 10 '14 at 15:03
  • 3
    @nawfal: If you return the `IEnumerator` of the array, you wouldn't be able to detect the list changing between calls, which should trigger an exception. – Jon Skeet Jun 10 '14 at 15:06
  • @JonSkeet That's an excellent point and answers my question!. If you could make it an answer I will appreciate it (vcsjones has answered it, but for some reason he has deleted it). But I also do think utilizing enumerators of anything above array (like `List`, `Queue`) in the framework is not a bad choice considering they do catch modifications. – nawfal Jun 10 '14 at 15:38

2 Answers2

7

The answer to your first question is: when a yield return doesn't meet your needs.

The answer to your second question is: these heavily used types have performance requirements that are unusually strict, so the enumerators are custom built. I've written some articles on this recently; see:

http://ericlippert.com/2014/05/21/enumerator-advance/

http://ericlippert.com/2014/06/04/enumerator-bounds/

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067
2

Just to answer one part:

List<T> has its own enumerator implementation for two reasons:

  • It can't just return the iterator of its backing array, because:
    • It needs to detect structural changes in the list (additions and removals) in order to invalidate the enumerator
    • The array may well be larger than the list. (Using Take would fix this, at the cost of another level of indirection)
  • The above could be performed with an iterator block (although there'd have to be a non-iterator method first, in order to capture the "structural version" of the list at call time rather than first iteration time), but that's relatively inefficient compared with the actual highly-optimized mutable struct implementation

Using a mutable struct here has certain issues, but when used in the expected fashion, it avoids heap allocations, virtual method calls via references etc.

Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • The array length is another excellent point, easy to miss. But I do think returning the enumerator of built in types from the framework like dictionary, list, stack, queue, set etc (other than array of course) is ok, considering their enumerator handles this collection modification issue. Isn't it? – nawfal Jun 10 '14 at 15:52
  • @nawfal: Returning the enumerator in what context? It's often okay, but sometimes it isn't. There's no one-size-fits-all answer here. – Jon Skeet Jun 10 '14 at 16:08
  • I meant in the same case as in the example I have provided in my question. If the backing collection type of my class `SpecialCollection` is `List`, then returning `List.GetEnumerator` from the `SpecialCollection.GetEnumerator` method is error free, I guess. – nawfal Jun 10 '14 at 16:24
  • 1
    @nawfal: Yes, if you don't have any other particular requirements, I think that should be okay. – Jon Skeet Jun 10 '14 at 16:32