Why is it that in CIL, the compiler converts a foreach loop into a for loop when an array is used, but it uses the iterator pattern when a List<T> is used?

If both System.Array and System.Collections.Generic.List<T> implement IEnumerable, shouldn't they both use the iterator pattern behind the scenes?

Here is an example:

Console App1:

C#:

class Program
{
    static void Main(string[] args)
    {
        var enumerable = new List<string> { "a", "b" };

        foreach (string item in enumerable)
        {
            string x = item;
        }
    }
}

CIL:

.method private hidebysig static 
    void Main (
        string[] args
    ) cil managed 
{
    // Method begins at RVA 0x2050
    // Code size 80 (0x50)
    .maxstack 3
    .entrypoint
    .locals init (
        [0] class [mscorlib]System.Collections.Generic.List`1<string> enumerable,
        [1] valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<string>,
        [2] string item,
        [3] string x
    )

    IL_0000: nop
    IL_0001: newobj instance void class [mscorlib]System.Collections.Generic.List`1<string>::.ctor()
    IL_0006: dup
    IL_0007: ldstr "a"
    IL_000c: callvirt instance void class [mscorlib]System.Collections.Generic.List`1<string>::Add(!0)
    IL_0011: nop
    IL_0012: dup
    IL_0013: ldstr "b"
    IL_0018: callvirt instance void class [mscorlib]System.Collections.Generic.List`1<string>::Add(!0)
    IL_001d: nop
    IL_001e: stloc.0
    IL_001f: nop
    IL_0020: ldloc.0
    IL_0021: callvirt instance valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<!0> class [mscorlib]System.Collections.Generic.List`1<string>::GetEnumerator()
    IL_0026: stloc.1
    .try
    {
        IL_0027: br.s IL_0035
        // loop start (head: IL_0035)
            IL_0029: ldloca.s 1
            IL_002b: call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<string>::get_Current()
            IL_0030: stloc.2
            IL_0031: nop
            IL_0032: ldloc.2
            IL_0033: stloc.3
            IL_0034: nop

            IL_0035: ldloca.s 1
            IL_0037: call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<string>::MoveNext()
            IL_003c: brtrue.s IL_0029
        // end loop

        IL_003e: leave.s IL_004f
    } // end .try
    finally
    {
        IL_0040: ldloca.s 1
        IL_0042: constrained. valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<string>
        IL_0048: callvirt instance void [mscorlib]System.IDisposable::Dispose()
        IL_004d: nop
        IL_004e: endfinally
    } // end handler

    IL_004f: ret
} // end of method Program::Main

Console App2:

C#:

class Program
{
    static void Main(string[] args)
    {
        var enumerable = new string[] { "a", "b" };

        foreach (string item in enumerable)
        {
            string x = item;
        }
    }
}

CIL:

.method private hidebysig static 
    void Main (
        string[] args
    ) cil managed 
{
    // Method begins at RVA 0x2050
    // Code size 51 (0x33)
    .maxstack 4
    .entrypoint
    .locals init (
        [0] string[] enumerable,
        [1] string[],
        [2] int32,
        [3] string item,
        [4] string x
    )

    IL_0000: nop
    IL_0001: ldc.i4.2
    IL_0002: newarr [mscorlib]System.String
    IL_0007: dup
    IL_0008: ldc.i4.0
    IL_0009: ldstr "a"
    IL_000e: stelem.ref
    IL_000f: dup
    IL_0010: ldc.i4.1
    IL_0011: ldstr "b"
    IL_0016: stelem.ref
    IL_0017: stloc.0
    IL_0018: nop
    IL_0019: ldloc.0
    IL_001a: stloc.1
    IL_001b: ldc.i4.0
    IL_001c: stloc.2
    IL_001d: br.s IL_002c
    // loop start (head: IL_002c)
        IL_001f: ldloc.1
        IL_0020: ldloc.2
        IL_0021: ldelem.ref
        IL_0022: stloc.3
        IL_0023: nop
        IL_0024: ldloc.3
        IL_0025: stloc.s x
        IL_0027: nop
        IL_0028: ldloc.2
        IL_0029: ldc.i4.1
        IL_002a: add
        IL_002b: stloc.2

        IL_002c: ldloc.2
        IL_002d: ldloc.1
        IL_002e: ldlen
        IL_002f: conv.i4
        IL_0030: blt.s IL_001f
    // end loop

    IL_0032: ret
} // end of method Program::Main
David Klempfner
  • Are you using dnSpy? It gives you tooltips that explain each IL call. It makes sense that an array wouldn't need an iterator, since it's a contiguous memory allocation. It's an obvious optimisation to do it this way. – Jeremy Thompson Sep 15 '19 at 03:00
  • @JeremyThompson But a List is just an array behind the scenes as well though. – David Klempfner Sep 15 '19 at 03:07
  • If the compiler knows that the IEnumerable is an array, it would make sense for it to optimize a foreach into a for; it is so much simpler (and faster) to iterate over an array by index than by any other MoveNext/Current algorithm I can think of. – Flydog57 Sep 15 '19 at 03:07
  • The difference is that with arrays no object is allocated to manage the iteration and bounds checking is removed. With Lists the iteration management variable is stack allocated and bounds checking is performed. So it's clear why the language designers used the For loop (changing ForEach in the IL output) with arrays when iterating. – Jeremy Thompson Sep 15 '19 at 03:27
  • @JeremyThompson I understand now. Eric Lippert puts it quite well here: https://stackoverflow.com/questions/7350495/whats-going-on-behind-the-scene-of-the-foreach-loop "This need not be the code that is generated; all that is required is that we generate code that produces the same result. For example, if you "foreach" over an array or a string, we just generate a "for" loop" – David Klempfner Sep 15 '19 at 03:30
  • @MickyD but System.Array implements IEnumerable. This page says Array uses GetEnumerator(): https://learn.microsoft.com/en-us/dotnet/api/system.array.getenumerator?view=netframework-4.8 – David Klempfner Sep 15 '19 at 04:44
  • @backwards_dave: that page says that Array implements `GetEnumerator`. It also says *using `foreach` is recommended, instead of directly manipulating the enumerator.* Of course, you can call GetEnumerator and then MoveNext/Current on the result. But that doesn't stop the compiler from doing something else equivalent under the covers. – Flydog57 Sep 15 '19 at 12:54

1 Answer

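The difference is that with arrays no enumerator is created to manage the iteration: the compiler emits an index loop over a local copy of the array reference, and because the index is compared against that array's own length, the JIT can typically eliminate the per-element bounds check. With List<T>, the iteration state lives in a List<T>.Enumerator struct (a stack-allocated local, as the CIL in the question shows), and every MoveNext call checks the index against the list's size and verifies that the list has not been modified. So it's clear why the language designers lower foreach to a for loop in the IL output when iterating over arrays.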
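As a rough illustration, the two CIL listings in the question correspond approximately to the hand-written C# below. This is only a sketch of the shape of the generated code, not the compiler's exact output; the class and method names (ForeachLowering, SumLengthsArray, SumLengthsList) are made up for the example, and the loop bodies sum string lengths only so that there is something observable to return.

```csharp
using System;
using System.Collections.Generic;

static class ForeachLowering
{
    // Approximate shape of: foreach (string item in source) over a string[]
    public static int SumLengthsArray(string[] source)
    {
        int total = 0;
        string[] copy = source;                 // the unnamed string[] local [1] in the CIL
        for (int i = 0; i < copy.Length; i++)   // the unnamed int32 local [2] is the index
        {
            string item = copy[i];              // ldelem.ref
            total += item.Length;
        }
        return total;
    }

    // Approximate shape of: foreach (string item in source) over a List<string>
    public static int SumLengthsList(List<string> source)
    {
        int total = 0;
        // List<T>.Enumerator is a struct, so this local is not a heap allocation.
        List<string>.Enumerator e = source.GetEnumerator();
        try
        {
            while (e.MoveNext())
            {
                string item = e.Current;
                total += item.Length;
            }
        }
        finally
        {
            e.Dispose();                        // matches the finally block in the CIL
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumLengthsArray(new[] { "a", "bc" }));
        Console.WriteLine(SumLengthsList(new List<string> { "a", "bc" }));
    }
}
```

Both methods visit the same elements in the same order; only the mechanism differs, which is the point of the Eric Lippert quote in the comments.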

Since an array doesn't support adding or removing items, its Length is fixed for the whole loop. Accessing array items by index is therefore a safe optimisation over going through an enumerator (the IEnumerable implementation).
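The fixed-length point can be seen from the outside with a small experiment. The sketch below assumes the documented behaviour that a List<T> enumerator throws InvalidOperationException once the list has been modified, while an array's length cannot change during the loop. The names ModificationDemo, ListThrowsWhenModified and CountArrayIterations are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;

static class ModificationDemo
{
    // Returns true if enumeration fails after the list is changed mid-loop.
    public static bool ListThrowsWhenModified()
    {
        var list = new List<string> { "a", "b" };
        try
        {
            foreach (string item in list)
            {
                list.Add("c"); // changes the list, so the next MoveNext call throws
            }
        }
        catch (InvalidOperationException)
        {
            return true;
        }
        return false;
    }

    // Overwriting an element leaves Length unchanged, so the index loop is unaffected.
    public static int CountArrayIterations()
    {
        var array = new[] { "a", "b" };
        int count = 0;
        foreach (string item in array)
        {
            array[0] = "z";
            count++;
        }
        return count;
    }

    static void Main()
    {
        Console.WriteLine(ListThrowsWhenModified());
        Console.WriteLine(CountArrayIterations());
    }
}
```

The list enumerator has to carry that modification check on every step, whereas the array loop only needs an index and a length comparison.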

Jeremy Thompson