6

I ran the following console application:

class Program
{
    static void Main(string[] args)
    {
        int n = 10000;

        Stopwatch s = new Stopwatch();
        s.Start();
        List<int> numbers = GetListNumber(n);
        foreach (var number in numbers)
        {

        }
        s.Stop();
        Console.WriteLine(s.Elapsed);
        Console.WriteLine();

        s.Restart();
        foreach (var number in GetEnumerator(n))
        {

        }
        s.Stop();
        Console.WriteLine(s.Elapsed);
        Console.ReadKey();
    }

    static List<int> GetListNumber(int n)
    {
        List<int> numbers = new List<int>();
        for (int i = 0; i < n; i++)
            numbers.Add(i);
        return numbers;
    }

    static IEnumerable<int> GetEnumerator(int n)
    {
        for (int i = 0; i < n; i++)
            yield return i;
    }
}

in order to compare the time we need to iterate through the elements of a collection and if it was better to build this collection using a List or an IEnumerable. To my surprise, the result it was 00:00:00.0005504 for the List and 00:00:00.0016900 for the IEnumerable. I was expecting that the second way, IEnumerable, it would be faster because the values are created on the fly and we don't have to add them each one of them one item at the time, like in the case of a List and then iterate through it.

Could please someone explain me this difference? Why we got this behavior and no the opposite one.

Thanks in advance for any help!

Christos
  • 53,228
  • 8
  • 76
  • 108
  • 4
    You do know that your measurements are useless? This is a maxof 1.5ms vs. 0.55ms - all way too small. I would in your case rewrite and rerun the test in a more meaningfull size (so that the runtime is around a second) and see whtehr the error is still there then. – TomTom Mar 23 '14 at 15:34
  • I am getting better results with the Enumerable. – Rafa Paez Mar 23 '14 at 15:37
  • @TomTom I gave it a go using n=1000000 and for the first case was 0.0119866 s and for the second 0.0248129. I did also the same fo n = 100000000 and again I got 2.1709454 s and 2.3085936 s correspondigly. So, I can't still explain it. – Christos Mar 23 '14 at 15:38
  • Ok, that is a lot better. More meaningfull - and at least with the 2 second it geot better. Did you run a profiler and check where the time is spent? – TomTom Mar 23 '14 at 15:41
  • Did you tried with the optimization enabled? For instance, in Release configuration. – Dmitry Mar 23 '14 at 15:43
  • @TomTom nope I didn't run a profiler. But I will try now. – Christos Mar 23 '14 at 15:44
  • 1
    @Dmitry one wierd thing: I was running my code in Debug cofiguration and I changed it to Release. Now for the first case I got 1.73 s and for the second 1.67 !!! I can't get it. – Christos Mar 23 '14 at 15:46
  • @ChristosPaisios are you running it with debuggin enabled? – GregRos Mar 23 '14 at 15:57
  • Yeah, my results are completely different. `IEnumerable` performs around twice as fast in this synthetic scenario. Interestingly though, using `IList` instead of `List` makes `List` perform around 50% more slowly. I suspect the results here are due to having debugging enabled, issues with JITting, GC, different framework version, no optimizations (or hidden optimizations), and/or any of the other issues that plague micro-benchmarking. [HERE](https://gist.github.com/anonymous/80ca574cfbf0d5cc4b83) is the code I ran. – GregRos Mar 23 '14 at 16:16
  • When doing similar measurements, even following the one-second rule you will notice differences when switching the loops, i.e. Test loop#2 first and then loop#1. You will/might also notice differences when running loop1, then loop2, then loop1 again. Some CPU also perform better, if you give them some work before you actually start the test, i.e. wake them up. Your difference is approx 1-5% which is within what I have seen in similar tests (although using a different language). I would call this simply 'measurement noise'. – alzaimar Mar 23 '14 at 16:34

3 Answers3

4

First of all, the way you test is not really able to give you useful impressions of performance differences. An iteration of 10000 items is really too short; you can already see this as you have results in microseconds. Instead you should always try to get multiple seconds out of it. In addition, you should always run the same test multiple times in sequence, and then take the average out of it. That way you remove random influences and get a more stable result (see also the law of large numbers).

But yes, iterating a generator function will probably be slower than a list. This is for different reasons: First of all, as you are taking items from a function which pauses its execution, you essentially end up with a lot of context switches. I’m not sure how optimized this is for generator functions, but you still have to deal with them in some way, so you do have a penality there.

Second, lists internally use arrays which are dynamically resizes as necessary. So in the end, when you are iterating over a list, you are iterating over an array. You are iterating over a sequence of numbers in memory. This will always be faster than anything else.

The big difference, which is what should make you consider generator functions over complete lists, is the memory aspect. When you create the list, you are quickly generating all items, put them into memory, and then iterate over them quickly again. But you also put them all into memory. So depending on the number of items, this can mean a large cost. Especially when you only need to access an item once, that’s often not worth it. On the other hand, generator functions will only require memory for a single item, so memory-wise, this is very efficient.

Finally, while there is a speed difference, this will likely never matter much. It’s rare that an application will be slower because you decided to use a generator function somewhere. It’s more likely, that the bottleneck of your application lies elsewhere, most likely in I/O or network operations, so you really shouldn’t care about it until it becomes a problem.

poke
  • 369,085
  • 72
  • 557
  • 602
  • IMO in this case, interface method calls may have a performance impact. Firstly, calling `MoveNext()` and `Current` involves a vtable lookup or similar, which is important in microbenchmarks such as these. In addition, interface methods probably cannot be inlined by the JITter. – GregRos Mar 23 '14 at 15:53
0

Simple answer.

The list uses a lot of memory, which will overload caches. THe method will not use a lot of memory and thus run in the first level cache of the processor. At least one possible explanation. Especially when you do a LOT more than 1000 numbers.

TomTom
  • 61,059
  • 10
  • 88
  • 148
0

The difference is possibly also because different enumerators are used underneath. For example, IL for enumerating the List<T> looks as follows:

callvirt    System.Collections.Generic.List<System.Int32>.GetEnumerator
stloc.s     04 // CS$5$0000
br.s        IL_0030
ldloca.s    04 // CS$5$0000
call        System.Collections.Generic.List<System.Int32>+Enumerator.get_Current
stloc.3     // number
nop         
nop         
ldloca.s    04 // CS$5$0000
call        System.Collections.Generic.List<System.Int32>+Enumerator.MoveNext
stloc.s     05 // CS$4$0001
ldloc.s     05 // CS$4$0001
brtrue.s    IL_0026
leave.s     IL_004E
ldloca.s    04 // CS$5$0000
constrained. System.Collections.Generic.List<>.Enumerator
callvirt    System.IDisposable.Dispose
nop         
endfinally  

And IL for iterating through IEnumerable<T> as follows:

callvirt    System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator
stloc.s     06 // CS$5$0002
br.s        IL_008E
ldloc.s     06 // CS$5$0002
callvirt    System.Collections.Generic.IEnumerator<System.Int32>.get_Current
stloc.3     // number
nop         
nop         
ldloc.s     06 // CS$5$0002
callvirt    System.Collections.IEnumerator.MoveNext
stloc.s     05 // CS$4$0001
ldloc.s     05 // CS$4$0001
brtrue.s    IL_0084
leave.s     IL_00B1
ldloc.s     06 // CS$5$0002
ldnull      
ceq         
stloc.s     05 // CS$4$0001
ldloc.s     05 // CS$4$0001
brtrue.s    IL_00B0
ldloc.s     06 // CS$5$0002
callvirt    System.IDisposable.Dispose
nop         
endfinally  

As you can see, the former is using call for Current and MoveNext calls but the latter is using callvirt. This is because one is iterating over List<T>.Enumerator which cannot be inherited, the other one is using IEnumerator<T> where inheritance must also be taken into account (for example, you could return you own enumerator which would actually inherit from another one).

Further read on call vs callvirt can be had: call and callvirt.

Community
  • 1
  • 1
Patko
  • 4,365
  • 1
  • 32
  • 27