3

I have a sequence of items represented by IEnumerable

I need to loop thru these items, and It should be a for-loop because the index is important.

My question is, is there a difference in performance between the following 2 options?

1.

for (int i = 0; i < items.Count(); i++)
{
    //Do something
}
var itemsLength = items.Count();

for (int i = 0; i < itemsLength; i++)
{
    //Do something
}

In other words, does the method items.Count() run again and again on each iteration in option 1?

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
SeReGa
  • 1,219
  • 2
  • 11
  • 32
  • 2
    You might clarify how you're populating the `IEnumerable` since, by definition, there is no "index" in an enumerable object. You just get the next item. This is different if you have an `Array`, `List`, etc which implements the `IEnumerable` interface but also has an indexed collection. Additionally, if your enumerable [can rely on e.g. `List.Count`, there won't be a huge difference](/a/853478/14956277) between your approaches. If instead your enumerable has to perform some deferred work, then the second [might](/a/169699/14956277) do twice as much work as the first. – D M Jan 31 '22 at 20:44
  • It doesn't have to be `for`. LINQ's Select for example has a `Select((item,index)=>...)` overload. What you attempt to do can result in N^2 iterations - each `Count()` will iterate everything from the start, and accessing an element by index through an Enumerable will iterate to the index. – Panagiotis Kanavos Jan 31 '22 at 21:08
  • @DM, In fact, the origin is a `List`, but I cannot rely on it. Could you please explain what do you mean "some deferred work"? – SeReGa Jan 31 '22 at 21:37

2 Answers2

7

Calling Count() on an enumerable source will exhaust the enumerable until all elements are enumerated. Calling Count() on an enumerable in the condition-block in a for loop will therefore exhaust the enumerable at every iteration. For example, calling

var numbers = VerboseRange(1, 5);
for (var index = 0; index < numbers.Count(); index++)
{
    Console.WriteLine($"For-loop is at index {index}...");
}

IEnumerable<int> VerboseRange(int start, int count)
{
    foreach (var number in Enumerable.Range(start, count))
    {
        Console.WriteLine($"Yielded number {number}.");
        yield return number;
    }
}

will output

Yielded number 1.
Yielded number 2.
Yielded number 3.
Yielded number 4.
Yielded number 5.
For-loop is at index 0...
Yielded number 1.
Yielded number 2.
Yielded number 3.
Yielded number 4.
Yielded number 5.
For-loop is at index 1...
Yielded number 1.
Yielded number 2.
Yielded number 3.
Yielded number 4.
Yielded number 5.
For-loop is at index 2...
Yielded number 1.
Yielded number 2.
Yielded number 3.
Yielded number 4.
Yielded number 5.
For-loop is at index 3...
Yielded number 1.
Yielded number 2.
Yielded number 3.
Yielded number 4.
Yielded number 5.
For-loop is at index 4...
Yielded number 1.
Yielded number 2.
Yielded number 3.
Yielded number 4.
Yielded number 5.

Therefore, counting before is better.

However, I would recommend you to use a counter and a foreach-loop

var count = 0;
foreach (var item in items)
{
    // do something
    count++;
}

In C# 7.0, you can finally do

foreach (var (item, index) in items.WithIndex())
{
    // do something
}
Thomas Angeland
  • 441
  • 5
  • 10
6

Usually when it comes to these type of performance questions, it's just easier to test it. Using BenchmarkDotnet :

|  Method |      Mean |     Error |    StdDev | Ratio | RatioSD | Allocated |
|-------- |----------:|----------:|----------:|------:|--------:|----------:|
| Option2 |  49.40 ns |  0.700 ns |  0.654 ns |  1.00 |    0.00 |         - |
| Option1 | 955.00 ns | 16.993 ns | 15.064 ns | 19.30 |    0.34 |         - |

Option1 is 20x slower. Here the IEnumerable was generated using Enumerable.Range(0,100)

I then also tested to see what would happen if your actual type was a List and whether the Count() is optimized to just return the Count property. These were the results:

|  Method |      Mean |    Error |   StdDev | Ratio | RatioSD | Allocated |
|-------- |----------:|---------:|---------:|------:|--------:|----------:|
| Option2 |  40.53 ns | 0.246 ns | 0.192 ns |  1.00 |    0.00 |         - |
| Option1 | 452.07 ns | 4.906 ns | 4.097 ns | 11.14 |    0.12 |         - |

This time option1 is 11x slower, seems like it's only due to not having to generate the full 100 items enumerable.

Test cases used:

[Benchmark]
public static int Option1()
{
    var x = 0;
    for (int i = 0; i < Data.Count(); i++)
    {
        x++;
    }

    return x;
}

[Benchmark(Baseline = true)]
public static int Option2()
{
    var itemsLength = Data.Count();
    var x = 0;
    for (int i = 0; i < itemsLength; i++)
    {
        x++;
    }

    return x;
}

JohanP
  • 5,252
  • 2
  • 24
  • 34
  • Thank you. It looks like calculating the count before is better anyway. But what if you create a larger range? 10000 items for example. Is it the same ratio? Or Option 2 will finish within the same time? – SeReGa Jan 31 '22 at 22:12