3

List Benchmark: Size=1000, Runs=50000, Total Time = 19.5 sec

var list = new List<string>(Size);
for (int i = 0; i < Size; i++) list.Add(i.ToString());

var b = new Benchmark();
b.Test("TestListIteration", () =>
{
   c = 0;
   for (int i = 0; i < Runs; i++)
   {
      for (int j = 0; j < Size; j++)
      {
          c += list[j].Length;
      }
   }
});

List To Array Benchmark: Size=1000, Runs=50000, Total Time=15.449

var list = new List<string>(Size);
for (int i = 0; i < Size; i++) list.Add(i.ToString());

var b = new Benchmark();
b.Test("TestListIteration", () =>
{
   c = 0;
   for (int i = 0; i < Runs; i++)
   {
      var array = list.ToArray(); //Changed line here !!!
      for (int j = 0; j < Size; j++)
      {
          c += array[j].Length;
      }
   }
});

Isn't this a paradox ?

How can performing two actions
a) converting entire list to array and
b) iterating entire array.
Be faster than doing b alone (iterating list).

If this is the case, then it means than all code written in the world is wrong. We should have predicted for this case. And each "For" loop on a list should automatically call .ToArray before it starts. Even if the array is discarded later.

Edit: Here are the Results depending on the "Size".

Size=10, Runs=5000000: List Wins List : 20.362, ListToArray: 37.36

Size=100, Runs=500000: List Wins List : 19.64, ListToArray: 23.162

Size=1000, Runs=50000: ListToArray Wins List : 19.5, ListToArray: 15.449

Size=10000, Runs=5000: ListToArray Wins List : 20.094, ListToArray: 14.453

Size=10000000, Runs=5: Computer died

zafena
  • 43
  • 5
  • 1
    I think you are converting an anonymous linq q object to a c# array. Index a real array is faster then an anonymous array. – jdweng Jun 06 '17 at 16:32
  • `list[j]` presumably implicitly casts the list to an array, as index-based retrieval has been the way to interact with arrays, not lists. This means that you are implicitly casting the list to an array for every iteration, as opposed to doing it once and then using the array without a need to still cast it. – Flater Jun 06 '17 at 16:32
  • @Flater A cast? This is just the indexer of `List`. https://msdn.microsoft.com/en-us/library/0ebtbkkc(v=vs.110).aspx – adjan Jun 06 '17 at 16:35
  • 3
    The underlying implementation of `List` **is** an array. So doing `list[i]` is making at least two function calls (the [] of the `list[i]` is a function call, then another function call internally to get the underlying array) and then array subscripting where `array[i]` is doing just array subscripting. – GreatAndPowerfulOz Jun 06 '17 at 16:42
  • i do not understand what I am doing that you consider is wrong. I am not a c# expert. @jdweng you mentioned linq. where exactly did I use linq. I repeat I am not expert. You can repeat the benchmark test outside the Benchmark.test() expression, I doubt the result will change... – zafena Jun 06 '17 at 16:47
  • Possible duplicate of [Performance of Arrays vs. Lists](https://stackoverflow.com/questions/454916/performance-of-arrays-vs-lists) – Rufus L Jun 06 '17 at 17:01
  • Indexed access to a list happens via a function call, whereas indexed access to an array is implemented directly in IL. The cost of theses method calls is likely to be your main culprit here. – spender Jun 06 '17 at 17:01
  • @Rufus L Here the test is "List" vs "List + .ToArray". And "List + .ToArray" wins. This isn't explained in that post? It performs 1 iteration operation there. Here convertion + iteration is faster than iteration. – zafena Jun 06 '17 at 17:08
  • Yes. Arrays are faster when you have a fixed length and are processing in a tight loop. This is what the answer to the other question states. – Rufus L Jun 06 '17 at 17:11
  • I suspect that JIT inlining may skew the benchmark, however i have no way to prove it, any ideas ? – zafena Jun 06 '17 at 17:12
  • And, for what it's worth, when I run your code using a stopwatch, the time difference is only about **40 miliseconds**. `List processing = 357 ms; Array processing = 316 ms` where `var Size = 1000;` and `var Runs = 50000;` – Rufus L Jun 06 '17 at 17:16
  • Have you tried benchmarking ToArray & Array indexing separately? Array indexing is cheaper than list indexing - seems quite probable that for a small n array copy is cheaper than the list indexing overhead. – user6144226 Jun 06 '17 at 17:32
  • 1
    I did the benchmark multiple times, it seems list is faster for smaller sizes that list + toArray – zafena Jun 06 '17 at 17:43
  • 2
    Benchmarking microoptimizations is hard. When you're testing code that has completely miniscule differences in performance, you're simply going to have more general noise unrelated to your change that will more than drown it out unless you're *very* skilled at your tests (and sometimes even then). – Servy Jun 06 '17 at 17:44

1 Answers1

16

Isn't this a paradox?

No.

How can performing two actions a) converting entire list to array and b) iterating entire array. Be faster than doing b alone (iterating list)?

Converting a list to an array is extremely fast.

Iterating elements of a list is very slightly slower than iterating elements of an array.

Why are these things true?

Because (1) lists are secretly arrays behind the scenes, (2) array-to-array copy is heavily optimized -- it is done by going directly to hardware, not by iterating every element and copying it one at a time -- so list-to-array copying is also heavily optimized, and (3) list indexing is just array indexing plus a little more work, so list indexing must be slightly slower.

As you have discovered, when you balance out one very fast thing against a great many slightly slower things, there comes a point where one will win over the other.

If this is the case, then it means than all code written in the world is wrong.

No it does not.

We should have predicted for this case.

No, we should not have worried about it. Don't believe me? Just find me a product in the marketplace today whose success or failure was predicated entirely upon choosing the list iteration technique that was a couple milliseconds faster.

And each "For" loop on a list should automatically call .ToArray before it starts.

Absolutely not. Even if we knew it was faster, which we don't, there are many kinds of performance to worry about. Doubling the amount of memory in use in order to save a couple milliseconds works directly against the performance goal of minimizing memory usage! Not everyone cares about raw speed.

In your case the difference was 4000 milliseconds over 50000 runs, so that's less than a tenth of a millisecond saved on each run. If you care about that tenth of a millisecond, then by all means make this change to your code.

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067