4

Edit: I have updated the question and attached memory tests.

I benchmarked the ToArray and ToList methods on an IEnumerable<int> and noticed a pit in the graph between 530,000 and 800,000 items. Here is my benchmark code:

    using System.Collections.Generic;
    using System.Linq;
    using BenchmarkDotNet.Attributes;

    [MarkdownExporter, AsciiDocExporter, HtmlExporter, CsvExporter, RPlotExporter, PlainExporter]
    [MemoryDiagnoser]
    public class IntBenchmarks
    {
        private IEnumerable<int> EnumerableInts;

        [Params(
            // 10000 ... 1000000
        )]
        public int _count;

        public IntBenchmarks()
        {
            // The iterator is lazy: _count is read only when the sequence is
            // enumerated, i.e. inside the benchmark methods.
            EnumerableInts = GetEnumerableInts();
        }

        // A pure iterator: the number of items is not known up front.
        private IEnumerable<int> GetEnumerableInts()
        {
            for (var i = 0; i < _count; i++)
            {
                yield return 1;
            }
        }

        [Benchmark]
        public void ToArrayInt()
        {
            var r = EnumerableInts.ToArray();
        }

        [Benchmark]
        public void ToListInt()
        {
            var r = EnumerableInts.ToList();
        }
    }

Also, I know that memory is allocated (for a new array) when _count reaches 530000. I would like to understand why performance improves when that memory is allocated. I have the same benchmarks for IEnumerable of class, struct, int, and string; only IEnumerable<int> shows this behavior.


[Image: benchmark results graph]

I have checked this repeatedly.

Memory tests: [Image: memory test results]

Evgeniy Terekhin
  • .NET Framework or .NET Core? – Theodor Zoulias Jan 19 '20 at 10:41
  • I am surprised that `ToArray` is faster. I would have bet that `ToList` would be faster because it involves fewer steps. ([Is it better to call ToList() or ToArray() in LINQ queries?](https://stackoverflow.com/questions/1105990/is-it-better-to-call-tolist-or-toarray-in-linq-queries)). Did you test it on .NET Framework too? – Theodor Zoulias Jan 19 '20 at 10:47
  • The implementations of `ToList` are different in .NET Core and .NET Framework. I didn't test it on .NET Framework, but I know that `ToList` is better there, because the `ToArray` method calls `ToList` internally. But the .NET Core developers rewrote `ToArray` and it's faster now. @TheodorZoulias – Evgeniy Terekhin Jan 19 '20 at 10:57
  • I can see that there are also smaller pits starting at 130,000 and 260,000 items. It seems that the pits are following the resize pattern of the internal buffer (it is doubled in size when it becomes full). – Theodor Zoulias Jan 19 '20 at 12:29
  • Yes, I agree, but I can't understand why. Maybe this is a JIT trick? @TheodorZoulias – Evgeniy Terekhin Jan 19 '20 at 12:40
  • "Pro .NET Benchmarking: The Art of Performance Measurement", the book by the creator of this [Benchmark] library, describes this 'List.Add' behavior in the Pitfalls section. https://books.google.com/books?id=IXCfDwAAQBAJ&pg=PA59&lpg=PA59&dq=array.maxarraylength&source=bl&ots=P6WVcXmAQu&sig=ACfU3U1DH9PY4sAjYQEOTvlQaalzcF1UtA&hl=de&sa=X&redir_esc=y#v=onepage&q=array.maxarraylength&f=false – Holger Jan 19 '20 at 14:05
  • In .NET Core there is a specific optimization for enumerables created with `Enumerable.Range`. There is an internal class [`RangeIterator`](https://source.dot.net/System.Linq/System/Linq/Range.SpeedOpt.cs.html) that knows how to create arrays and lists as fast as possible. I have no idea what causes the pit though. It seems that there is a non-linear correlation between the size of the array and the time needed to allocate and initialize it. Larger arrays are allocated faster than smaller arrays, sometimes. – Theodor Zoulias Jan 19 '20 at 16:44
  • Could you please provide an example where allocating a larger array is faster than allocating a smaller one? @TheodorZoulias – Evgeniy Terekhin Jan 20 '20 at 03:51

2 Answers

1

The number of items in EnumerableInts is not known in advance; it is just an enumerable with an arbitrary number of items. Thus, when you call ToArray(), the runtime must allocate an array without knowing the final size. It then starts copying items from the enumerator. Once the array is full, it allocates a new array with twice the length and copies the existing items over. See EnumerableHelpers.cs(76)

The more items the enumerator returns, the more temporary arrays are allocated. The pits in your benchmark are caused by the garbage collector collecting the temporary arrays that are no longer referenced.
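
To make the grow-by-doubling strategy concrete, here is a minimal sketch. The class `DoublingSketch`, the method `ToArrayByDoubling`, the `reallocations` counter and the starting capacity of 4 are all assumptions made for illustration; this is not the actual System.Linq code.

```csharp
using System;
using System.Collections.Generic;

public static class DoublingSketch
{
    // Hypothetical helper for illustration; not the actual System.Linq code.
    public static T[] ToArrayByDoubling<T>(IEnumerable<T> source, out int reallocations)
    {
        reallocations = 0;
        var buffer = new T[4]; // assumed starting capacity
        var count = 0;

        foreach (var item in source)
        {
            if (count == buffer.Length)
            {
                // Allocates a new array of twice the length and copies the items;
                // the previous array is no longer referenced afterwards.
                Array.Resize(ref buffer, buffer.Length * 2);
                reallocations++;
            }
            buffer[count++] = item;
        }

        // Trim the buffer to the exact number of items (one more allocation).
        if (count != buffer.Length)
        {
            Array.Resize(ref buffer, count);
        }
        return buffer;
    }
}
```

With 10 items the buffer grows 4 → 8 → 16 (two reallocations) and is then trimmed to length 10. Under the same assumed starting capacity, the doubling sequence passes through 131,072, 262,144 and 524,288, which is close to the 130,000, 260,000 and 530,000 marks mentioned in the comments on the question.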

l33t
0

For a pure IEnumerable that cannot be cast to anything else, which is what you force by using a yield return iterator method, ToArray calls ToList, not because ToList is faster but because it is more flexible: it is prepared to handle an unknown number of items. We have to avoid enumerating the sequence twice, which is what would happen if one called the Count() extension first, for example.

ToArray() would be faster if the number of elements were known in advance. This is the case when the source is an ICollection; knowing the number of elements in advance is essentially the whole point of ICollection.

Your measurement is specific to pure iterators (with yield return). If the IEnumerable turns out to be an array, IList or ICollection, it is cast instead, and the native List.CopyTo or Array.CopyTo implementation is used. These do not enumerate the IEnumerable. Just to emphasize: list.ToList(), list.ToArray(), int[].ToList() and int[].ToArray() would not show this behaviour.
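
A rough sketch of that type check, assuming a hypothetical helper `ToArraySketch` with an `usedCopyTo` flag added purely for demonstration (neither is part of System.Linq, and the real implementation differs in detail):

```csharp
using System.Collections.Generic;

public static class FastPathSketch
{
    // Hypothetical helper for illustration; not the actual System.Linq code.
    public static T[] ToArraySketch<T>(IEnumerable<T> source, out bool usedCopyTo)
    {
        if (source is ICollection<T> collection)
        {
            // Count is known up front: allocate once and copy in bulk,
            // without calling MoveNext on an enumerator.
            usedCopyTo = true;
            var result = new T[collection.Count];
            collection.CopyTo(result, 0);
            return result;
        }

        // Pure iterator: the size is unknown, so items are collected one by one.
        usedCopyTo = false;
        var list = new List<T>();
        foreach (var item in source)
        {
            list.Add(item);
        }
        return list.ToArray();
    }
}
```

Passing an int[] or a List<int> takes the first branch, while a yield return iterator like the one in the question falls through to the second.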

I cannot explain possible differences between .NET Framework and .NET Core.

Holger
  • You described the .NET Framework behavior. The .NET Core `ToArray` method doesn't call `ToList`. And my question is why we have a pit from 530 to 800 thousand. Thank you – Evgeniy Terekhin Jan 19 '20 at 11:38
  • Yes, I know, it's not an answer. But it was too much for a comment, and it's important to point out that you are not measuring general behaviour, only the case that really enumerates with MoveNext. I only checked the source code, and it doesn't look different to me. – Holger Jan 19 '20 at 13:01
  • Did you look at this [source code](https://source.dot.net/#System.Linq/System/Linq/ToCollection.cs,783a052330e7d48d,references)? @Holger – Evgeniy Terekhin Jan 19 '20 at 13:09
  • @EvgeniyTerekhin Yes, and EnumerableHelpers.ToArray (https://source.dot.net/#System.Linq/System/Collections/Generic/EnumerableHelpers.cs,a8ba52888e4d53c1) does exactly what I'm describing. (I added another answer concerning the byte length.) We are just joining forces and ideas. – Holger Jan 19 '20 at 13:18