23

I'm looking for an easy way to clone off an IEnumerable<T> parameter for later reference. LINQ's ToArray extension method seems like a nice, concise way to do this.

However, I'm not clear on whether it's always guaranteed to return a new array instance. Several of the LINQ methods check the actual type of the enumerable and take a shortcut if possible; e.g., Count() checks whether the source implements ICollection<T>, and if so, reads its Count property directly; it only iterates the collection if it has to.

Given that mindset of short-circuiting where practical, it seems that if I call ToArray() on something that is already an array, ToArray() might short-circuit and simply return the same array instance. That would technically fulfill the requirements of a ToArray method.

From a quick test, it appears that, in .NET 4.0, calling ToArray() on an array does return a new instance. My question is, can I rely on this? Can I be guaranteed that ToArray will always return a new instance, even in Silverlight and in future versions of the .NET Framework? Is there documentation somewhere that's clear on this point?

Joe White

3 Answers

28

For non-empty collections, ToArray will always return a new array. Changing it to return an existing instance would be a horribly breaking change, and I'm utterly convinced that the .NET team wouldn't do this. Being able to modify the resulting array without affecting the source is an important thing to rely on. It's a shame it's not documented :(

There are lots of subtle bits of behaviour in LINQ to Objects which probably aren't worth relying on, but in this case it's such a massive bit of behaviour, I would be absolutely astonished for it to change.
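The non-empty case can be checked with a minimal sketch like the one below. The class and method names (`ToArrayCopyDemo`, `ReturnsNewInstance`) are made up for illustration; only `Enumerable.ToArray` and `ReferenceEquals` come from the framework.

```csharp
using System;
using System.Linq;

public static class ToArrayCopyDemo
{
    // Hypothetical helper: true when ToArray() produced a distinct instance.
    public static bool ReturnsNewInstance(int[] source)
    {
        int[] copy = source.ToArray();
        return !ReferenceEquals(source, copy);
    }

    public static void Main()
    {
        int[] original = { 1, 2, 3 };
        int[] copy = original.ToArray();

        // Mutate the copy only; the original should be unaffected.
        copy[0] = 42;

        Console.WriteLine(ReferenceEquals(original, copy)); // False
        Console.WriteLine(original[0]);                     // 1
    }
}
```

This only exercises a non-empty array; it says nothing about empty sources.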

Short-circuiting is great when it doesn't affect behaviour, but generally LINQ to Objects is pretty good about only optimizing in valid cases. You might want to look at the two posts in my Edulinq series covering optimization.

For empty source collections, some versions will return a different empty array on each call, and some will return the same empty array on each call. While that could break code, it's much less problematic than a change of implementation to say "if it's already an array, just return it" would be: empty arrays are naturally immutable, so the only way you'll be able to observe the difference is if you compare for reference identity.

Example:

var empty1 = new string[0];
var empty2 = new string[0];

var array1 = empty1.ToArray();
var array2 = empty2.ToArray();

// Prints True in some versions, and False in others
Console.WriteLine(ReferenceEquals(array1, array2));
Jon Skeet
  • Came searching cuz I wasn't 100% sure. XML comment on the method says `Creates an array from a System.Collections.Generic.IEnumerable\`1` so it's fairly "kinda documented" and "fairly convincing" (at least these days) but certainly I needed some more assurance. My use case was doing `.ToArray()` on a collection of EF entities before looping over and deleting them so I can not be removing an item from the thing I'm iterating. Definitely wouldn't expect this to change in the fx as it'd break A LOT of stuff for folks. – benmccallum Oct 08 '20 at 10:34
16

Since the ToArray method is implemented inside the .NET Framework, I wouldn't stake my life on MS never changing it. What I would do, however, is add a unit test asserting that ToArray returns a new array instance.

Assert.AreNotSame(myArray, myArray.ToArray());

That way, if you later change .NET framework versions, you will automatically know if the functionality changes.

CodingWithSpike
0

Update: Jon has updated the accepted answer to cover this case.

The accepted answer is wrong as of this writing.

If your IEnumerable<T> is empty, you'll get back the singleton Array.Empty<T>() instance.

https://source.dot.net/#System.Linq/EnumerableHelpers.Linq.cs,75

via https://source.dot.net/#System.Linq/System/Linq/ToCollection.cs,10
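If code really does need a distinct instance even for an empty source (for example, to tell arrays apart by reference), one possible workaround is sketched below. `ToFreshArray` is a hypothetical extension method, not part of LINQ; it assumes that allocating a private zero-length array is acceptable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FreshArrayExtensions
{
    // Hypothetical helper (not a framework API): returns an array instance
    // that is not shared with other callers, even when the source is empty.
    public static T[] ToFreshArray<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        T[] result = source.ToArray();

        // ToArray may hand back a shared empty array, so allocate a private one.
        return result.Length == 0 ? new T[0] : result;
    }
}

public static class FreshArrayDemo
{
    public static void Main()
    {
        string[] a = Enumerable.Empty<string>().ToFreshArray();
        string[] b = Enumerable.Empty<string>().ToFreshArray();

        // Each call allocates its own empty array, so this prints False.
        Console.WriteLine(ReferenceEquals(a, b));
    }
}
```

The extra allocation only happens for empty results, so the cost is small. Whether reference identity is a sound way to distinguish arrays in the first place is a separate design question.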

Matt Jenkins
  • While this is interesting, it doesn't change the validity of the answer. – Enigmativity Apr 13 '22 at 01:11
  • I've edited the answer now. – Jon Skeet Apr 13 '22 at 05:59
  • @Enigmativity I don't follow. The accepted answer had previously stated that "`ToArray` will *always* return a new array." If it *sometimes* returns the same empty instance, then how is that not invalid? The reason I arrived here was because I had learned that `ToArray` always returns a new array instance, and was surprised to track down a rare bug where that fact was relied upon to distinguish array instances by reference. Jon has already updated his answer, and even mentioned exactly the unusual case I had encountered! – Matt Jenkins Apr 13 '22 at 18:05