14

I noticed that List<T> defines its enumerator as a struct, while ArrayList defines its enumerator as a class. What's the difference? If I am to write an enumerator for my class, which one would be preferable?

EDIT: My requirements cannot be fulfilled using yield, so I'm implementing an enumerator of my own. That said, I wonder whether it would be better to follow the lines of List<T> and implement it as a struct.

oguz ismail
  • 1
  • 16
  • 47
  • 69
Hosam Aly
  • 41,555
  • 36
  • 141
  • 182
  • 1
    I would be extremely interested to know where you hit a problem using yield return. – Daniel Earwicker Dec 21 '08 at 14:55
  • 1) I very much hate "compiler magic" for which I'm not sure what the output would be. (Although I would utilize it for simple situations.) – Hosam Aly Dec 21 '08 at 15:23
  • 1
    2) I want to support a list that can be modified during iteration. Normal iterator semantics forbid that. (And I don't know how the compiler would know that my list was modified!) – Hosam Aly Dec 21 '08 at 15:25
  • @Earwicker: I tried using yield and disassembled the result with Reflector. It seems to me that my iterator was better implemented than the one the compiler generated (mine was faster, cost less memory and supported the Reset method). – Hosam Aly Dec 21 '08 at 15:48
  • 1
    A few harmless questions then: Wouldn't it be better to learn what the compiler magic does, so you can benefit from it? When say your implementation was faster and cost less memory, did you measure that in a realistic test? And how often do you use Reset? – Daniel Earwicker Dec 21 '08 at 16:52
  • @Earwicker: Thanks for the suggestion. I did learn what the compiler does by disassembling it. Regarding speed and memory, I based them on the variables and instructions in the disassembly, but I didn't actually run any benchmarks. As for the Reset() method, it was just for the sake of completeness. – Hosam Aly Dec 22 '08 at 08:46
  • 1
    For the `List` part, catch Eric Lippert's explanation in this question: [why-bcl-collections-use-struct-enumerators-not-classes?](http://stackoverflow.com/questions/3168311/why-bcl-collections-use-struct-enumerators-not-classes?lq=1) – nawfal Dec 01 '13 at 03:40

8 Answers8

11

The easiest way to write an enumerator in C# is with the "yield return" pattern. For example.

public IEnumerator<int> Example() {
  yield return 1;
  yield return 2;
}

This pattern will generate all of the enumerator code under the hood. This takes the decision out of your hands.

JaredPar
  • 733,204
  • 149
  • 1,241
  • 1,454
10

Like this others, I would choose a class. Mutable structs are nasty. (And as Jared suggests, I'd use an iterator block. Hand-coding an enumerator is fiddly to get right.)

See this thread for an example of the list enumerator being a mutable struct causing problems...

Mike Christiansen
  • 1,104
  • 2
  • 13
  • 30
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • Well, that example proved I should make it a class. :) But why is List implemented that way? Why was it not corrected? – Hosam Aly Dec 21 '08 at 15:07
  • It's certainly too late to change it now - but I don't know why it was designed that way in the first place. Presumably in the name of efficiency - but a bad call, IMO. – Jon Skeet Dec 21 '08 at 15:15
  • 1
    I presumed it was so the current state of an enumerator could be 'saved' just by copying the enumerator to another variable - I've recently used this 'feature' myself to do linked list slices (as LinkedList.Enumerator is also a struct) – thecoop Nov 13 '09 at 10:24
  • 4
    The link to the example is dead, and the Way Back Machine doesn't help either. – Cristian Diaconescu Jan 29 '13 at 14:07
  • @JonSkeet: I would guess that if languages had been designed so that they would look for a method called something like `DuckTypeGetEnumerator` before checking for a method called `GetEnumerator`, then `List.DuckTypeGetEnumerator` would have returned a struct and `List.GetEnumerator` would have returned a class. There are only upsides to having `foreach` use a struct internally, and only downsides to having `IEnumerator.GetEnumerator` return one; having the explicitly-implemented interface return a class while the class method returns a struct would have been odd, though. – supercat Jan 18 '14 at 16:25
  • Link in the answer is broken, and I'm guessing [this](http://aspnet-answers.com/microsoft/Csharp/31702392/c-compiler-challenge--see-if-you-can-answer-why.aspx) is where it was moved to. My apologies to any future visitors if/when this link is broken as well. – Tinister Oct 02 '15 at 08:03
  • @Tinister: Thanks - fixed the link. – Jon Skeet Oct 02 '15 at 08:07
  • @JonSkeet then why iterator blocks are translated by the compiler into value types? do you still think is a bad option? – codingadventures Feb 15 '16 at 01:48
  • @GiovanniCampo: They're not. Iterator blocks are translated into classes, as far as I can tell. Async methods are translated into mutable value types, for efficiency in simple cases. – Jon Skeet Feb 15 '16 at 07:47
10

Reason List uses a struct enumerator is to prevent garbage generation in foreach statements. This is pretty good reason especially if you are programming for Compact Framework, because CF doesn't have generational GC and CF is usually used on low performance hardware where it can quickly lead to performance issues.

Also, I don't think mutable structs are source of problems in examples some posted, but programmers that don't have good understanding of how value types work.

zigzag
  • 579
  • 5
  • 17
  • I can't see how a `struct` enumerator would prevent garbage generation. Wouldn't you be even more likely to end up with several copies of the same enumerator, as opposed to a single enumerator object if it were a `class`? Or do value types not need to be garbage-collected? – stakx - no longer contributing Oct 30 '11 at 08:48
  • Value types are not garbage-collected in the same way as reference types. Reference types live on the heap, while value types live on the stack. So while reference types are deallocated by compacting the GC heap (can be slow if you have large, complex heap); value types are deallocated by popping the stack (very fast). – zigzag Jul 18 '13 at 08:22
  • Usually you don't need to worry about garbage generation because GC will be fast enough and it will not be performance bottleneck. But in special cases like high frequency code, a lot of allocation on the heap can cause performance problems by causing GC to run more frequently than usual. (for example allocating a lot of new objects each frame in a game) – zigzag Jul 18 '13 at 08:33
5

Write it using yield return.

As to why you might otherwise choose between class or struct, if you make it a struct then it gets boxed as soon as it is returned as an interface, so making it a struct just causes additional copying to take place. Can't see the point of that!

Daniel Earwicker
  • 114,894
  • 38
  • 205
  • 284
5

An enumerator is inherently a changing structure, since it needs to update internal state to move on to the next value in the original collection.

In my opinion, structs should be immutable, so I would use a class.

Lasse V. Karlsen
  • 380,855
  • 102
  • 628
  • 825
4

Any implementation of IEnumerable<T> should return a class. It may be useful for performance reasons to have a GetEnumerator method which returns a struct which provides the methods necessary for enumeration but does not implement IEnumerator<T>; this method should be different from IEnumerable<T>.GetEnumerator, which should then be implemented explicitly.

Using this approach will allow for enhanced performance when the class is enumerated using a foreach or "For Each" loop in C# or vb.net or any context where the code which is doing the enumeration will know that the enumerator is a struct, but avoid the pitfalls that would otherwise occur when the enumerator gets boxed and passed by value.

Hosam Aly
  • 41,555
  • 36
  • 141
  • 182
supercat
  • 77,689
  • 9
  • 166
  • 211
  • 1
    It's worth noting that this level of performance gain is rarely important in most everyday applications. Devs should use a profiler to discover the real trouble spots before diving into a struct enumerator. Still, it's a valid technique, and I've used it a couple of times to good effect. – Jeff Sharp Jan 18 '14 at 15:16
  • `IEnumerator` is [implemented as a struct](https://referencesource.microsoft.com/#PresentationCore/Core/CSharp/System/Windows/Media/Generated/PathSegmentCollection.cs,aa427cb4a41bf9d5) in WPF's `PathFigureCollection`. Is this a concrete case when it is *useful for performance reasons to have a GetEnumerator method which returns a struct*? – zwcloud Apr 21 '19 at 12:13
  • 1
    @zwcloud: The presence of a public `GetEnumerator()` method which returns a structure will be useful for performance reasons when used in conjunction with a `foreach` loop. The use of a struct rather than a class to implement `IEnumerator.GetEnumerator()` will generally be relatively harmless. I am unaware of any cases where having the `GetEnumerator()` function that returns a structure be the implementation of `IEnumerator.GetEnumerator()` would be better for performance reasons than having the latter function return a class; more likely, the performance downsides were minor enough... – supercat Apr 21 '19 at 15:02
  • ...that MS decided to use one method (and enumerator type) for both purposes. – supercat Apr 21 '19 at 15:02
1

There's a couple of blog posts that cover exactly this issue. Basically, enumerator structs are a really really bad idea...

thecoop
  • 45,220
  • 19
  • 132
  • 189
  • There would be nothing but upside to having the thing used by `foreach` be a struct, if there were a clean way of having it not be the same type as is returned by `IEnumerable.GetEnumerator`. Interestingly, were it not for variable-type inference, it wouldn't matter much whether `GetEnumerator` returned a struct or class, since assigning the return value to `IEnumerator` effectively turns it into a class object (albeit one with a somewhat-broken `Equals` method). The problem is with `var myEnumerator=thing.GetEnumerator();`. – supercat Jan 18 '14 at 16:30
1

To expand on @Earwicker: you're usually better off not writing an enumerator type, and instead using yield return to have the compiler write it for you. This is because there are a number of important subtleties that you might miss if you do it yourself.

See SO question "What is the yield keyword used for in C#?" for some more details on how to use it.

Also Raymond Chen has a series of blog posts ("The implementation of iterators in C# and its consequences": parts 1, 2, 3, and 4) that show you how to implement an iterator properly without yield return, which shows just how complex it is, and why you should just use yield return.

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
Jay Bazuzi
  • 45,157
  • 15
  • 111
  • 168