4

Correct me if im wrong but while doing a foreach an IEnumerable<T> creates garbage no matter what T is. But I'm wondering if you have a List<T> where T is Entity. Then say there is a derived class in the list like Entity2D. Will it have to create a new enumerator for each derived class? Therefore creating garbage?

Also does having an interface let's say IEntity as T create garbage?

ares_games
  • 1,019
  • 2
  • 15
  • 32
Chris Watts
  • 2,284
  • 4
  • 23
  • 28
  • What do you mean ? Iterating an IEnumerable does not create "garbage". – driis Jul 15 '10 at 19:19
  • 6
    Lol, what did you smoke? Also define "garbage". foreach with a list has worked for me from the very beginning. – Gabriel Magana Jul 15 '10 at 19:20
  • 1
    @driis it creates new instance of IEnumerator. this can be considered garbage – Andrey Jul 15 '10 at 19:22
  • 3
    @Andrey: The IEnumerator generated by list is a struct, though - so it doesn't add any GC pressure, since it's saved on the stack (unless, of course, you stick it into a class yourself, but that would be very, very weird...) – Reed Copsey Jul 15 '10 at 19:25
  • @Reed: It wouldn't be *that* weird. Take a look at [this question](http://stackoverflow.com/questions/2828203/grouping-consecutive-identical-items-ienumerablet-to-ienumerableienumerablet), for example. – Dan Tao Jul 15 '10 at 20:15
  • @Dan: That's a fairly edge case, strange example, IMO ;) – Reed Copsey Jul 15 '10 at 20:29
  • @Reed: Yeah, true... but honestly, after that question it started seeming to me like not such a bad idea to wrap `IEnumerator` for other purposes when you have specialized problems like that (or, to put it another way, realizing that solving the problem was easy with the assistance of an expanded `IEnumerator` implementation was an "a ha" moment -- for me, anyway). So, edge case: yes; but not something I would dismiss as "very, very weird" without also conceding that it might *occasionally* make sense. – Dan Tao Jul 15 '10 at 20:38
  • @Dan: Oh, it does make sense - though I'd wrap it into a struct again, not a class, in which case, my original statement would still apply ;) – Reed Copsey Jul 15 '10 at 20:51
  • @Reed: Well, I can understand why the designers of `List` made `List.Enumerator` a `struct` as a performance optimization; but in general, implementing `IEnumerator` in a value type seems -- to *me* at least -- very risky, since `IEnumerator` by its very design defines a mutable type. Mutable value types tend to cause all kinds of bugs because they conflict so badly with developers' intuitions. So if I'm making some generic implementation of `IEnumerator`, I would *heavily* lean towards making it a class, personally. – Dan Tao Jul 15 '10 at 21:06

5 Answers5

22

List<T>'s GetEnumerator method actually is quite efficient.

When you loop through the elements of a List<T>, it calls GetEnumerator. This, in turn, generates an internal struct which holds a reference to the original list, an index, and a version ID to track for changes in the list.

However, since a struct is being used, it's really not creating "garbage" that the GC will ever deal with.


As for "create a new enumerator for each derived class" - .NET generics works differently than C++ templates. In .NET, the List<T> class (and it's internal Enumerator<T> struct) is defined one time, and usable for any T. When used, a generic type for that specific type of T is required, but this is only the type information for that newly created type, and quite small in general. This differs from C++ templates, for example, where each type used is created at compile time, and "built in" to the executable.

In .NET, the executable specifies the definition for List<T>, not List<int>, List<Entity2D>, etc...

Reed Copsey
  • 554,122
  • 78
  • 1,158
  • 1,373
  • 2
    +1. They also differ from Java's generics, which are purely a compile-time feature (all occurrences of type `T` are just regarded as `object`) – Adam Robinson Jul 15 '10 at 19:29
7

I think you may be interested in this article which explains why List(T) will not create "garbage", as opposed to Collection(T):

Now, here comes the tricky part. Rumor has it that many of the types in System.Collections.Generic will not allocate an enumerator when using foreach. List's GetEnumerator, for example, returns a struct, which will just sit on the stack. Look for yourself with .NET Reflector, if you don't believe me. To prove to myself that a foreach over a List doesn't cause any heap allocations, I changed entities to be a List, did the exact same foreach loop, and ran the profiler. No enumerator!

[...]

However, there is definitely a caveat to the above. Foreach loops over Lists can still generate garbage. [Casting List to IEnumerable] Even though we're still doing a foreach over a List, when the list is cast to an interface, the value type enumerator must be boxed, and placed on the heap.

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
Michael Stum
  • 177,530
  • 117
  • 400
  • 535
  • Thanks I saw that article but I didn't think it was specific about having interfaces as T – Chris Watts Jul 15 '10 at 20:02
  • Good point, it only doesn't generate garbage when use the list directly. When you go through the interface it create garbage. – Gamlor Jul 16 '10 at 11:48
3

An interesting note: as Reed Copsey pointed out, the List<T>.Enumerator type is actually a struct. This is both good and horrible.

It's good in the sense that calling foreach on a List<T> actually doesn't create garbage, as no new reference type objects are allocated for the garbage collector to worry about.

It's horrible in the sense that suddenly the return value of GetEnumerator is a value type, against almost every .NET developer's intuition (it is generally expected that GetEnumerator will return a non-descript IEnumerator<T>, as this is what is guaranteed by the IEnumerable<T> contract; List<T> gets around this by explicitly implementing IEnumerable<T>.GetEnumerator and publicly exposing a more specific implementation of IEnumerator<T> which happens to be a value type).

So any code that, for example, passes a List<T>.Enumerator to a separate method which in turn calls MoveNext on that enumerator object, faces the potential issue of an infinite loop. Like this:

int CountListMembers<T>(List<T> list)
{
    using (var e = list.GetEnumerator())
    {
        int count = 0;
        while (IncrementEnumerator(e, ref count)) { }

        return count;
    }
}

bool IncrementEnumerator<T>(IEnumerator<T> enumerator, ref int count)
{
    if (enumerator.MoveNext())
    {
        ++count;
        return true;
    }

    return false;
}

The above code is very stupid; it's only meant as a trivial example of one scenario in which the return value of List<T>.GetEnumerator can cause highly unintuitive (and potentially disastrous) behavior.

But as I said, it's still kind of good in that it doesn't create garbage ;)

Community
  • 1
  • 1
Dan Tao
  • 125,917
  • 54
  • 300
  • 447
  • +1, though it's also worth noting that the usage of `var` is what kills you, since it types the variable as `List.Enumerator`. If you were to use `IEnumerator`, you wouldn't have this issue, as you'd be passing around a single boxed reference to the original enumerator. – Adam Robinson Jul 15 '10 at 19:57
  • It should be noted that using value type there creates very good optimization opportunities for JIT. If you look at the native code produced, you'll see that it will inline pretty much everything, resulting in a simple loop over array indices; the only extra over an explicit loop you'd write yourself is the "version" check on every iteration to detect modifications in collection. – Pavel Minaev Jul 15 '10 at 20:01
  • @Adam Robinson: Yeah; I guess that's kind of what I mean, though, about developers' expectations for `GetEnumerator`. Plenty of developers would not think twice about using `var` because it's always expected for `GetEnumerator` to return something typed as `IEnumerator`. I'd also note that the issue here could be handled equally well (in fact I'd say better) by having the method accept an `IEnumerable` instead of a `List`. – Dan Tao Jul 15 '10 at 20:05
  • Yeah, I don't mean to diminish your point (that the developer would likely use `var`), but just wanted to bring that fact to the forefront. And, you're right, the method should have accepted an `IEnumerable` anyway. – Adam Robinson Jul 15 '10 at 20:10
  • A related posting: http://stackoverflow.com/questions/2939024/c-4-0-dynamic-and-foreach-statement/2939180#2939180 – Eric Lippert Jul 15 '10 at 20:35
  • It's too bad that there isn't a way of either (1) having a type specify that using `foreach` on it should call a method called something other than `GetEnumerator` which could return a value type while `GetEnumerator` would return a class type, or (2) having an attribute which could either specify a "inference" type for a method's return value or forbid type inference on it (so `var e = someList.GetEnumerator` would infer the type of `e` as `IEnumerator` even though `GetEnumerator` returned a `List.Enumerator`. – supercat Feb 04 '13 at 20:15
1

Regardless of whether it's a List<Entity>, List<Entity2D>, or List<IEntity>, GetEnumerator will be called once per foreach. Further, it is irrelevant whether e.g. List<Entity> contains instances of Entity2D. An IEnumerable<T>'s implementation of GetEnumerator may create reference objects which will be collected. As Reed noted, List<T> in MS .NET avoids this by using only value types.

Matthew Flaschen
  • 278,309
  • 50
  • 514
  • 539
0

The class List<T> implements IEnumerator<T> explicitly, so that calling GetEnumerator on a variable of type List<T> will cause it to return a List<T>.Enumerator, which has value-type semantics, whereas calling it on a variable of type IEnumerator<T> which holds a reference to a List<T> will cause it to return a value of type IEnumerator<T>, which will have reference semantics.

supercat
  • 77,689
  • 9
  • 166
  • 211