1

I have a collection of strings as an IEnumerable (for the example, let's say it's the result of some LINQ query on a collection).

IEnumerable<string> myStrings;

What's the difference between the following?

a.

result = myStrings as List<string>;

b.

result = myStrings.ToList();

Is one more efficient than the other? Does option b change myStrings itself?

Uwe Keim
  • The first one is a cast (and may return `null`); the other is a method call that is guaranteed to return a `List` (or throw an `ArgumentNullException`) – UnholySheep Aug 26 '18 at 08:36
  • Your answer may be in here https://stackoverflow.com/questions/2107115/ooc-what-is-the-difference-between-tolist-and-casting-to-listt-in-net – Md. Abdul Alim Aug 26 '18 at 08:42
  • Option a) will fail on most `IEnumerable`s. Option b) will only fail when `myStrings` is `null`. 'Efficient' will almost never be the deciding factor. – H H Aug 26 '18 at 08:46
  • It's not a matter of efficiency, it's a matter of whether `myStrings` is already a `List` or not. – Patrick Roberts Aug 26 '18 at 08:48

3 Answers

2

Option a - result = myStrings as List<string> is a safe way to typecast: it checks whether the object that myStrings refers to is actually a List<string>. There is no extra memory allocation, and because it is only a type conversion it can be attempted on any reference-type object. If the typecast fails, result is null and no exception is thrown, unlike the explicit cast (List<string>)myStrings, which throws an InvalidCastException.

In fact, an even better approach is the is operator: myStrings is List<string> result (pattern matching in an is expression). It yields a boolean result and, when that is true, result holds the converted value. The is operator has been around for quite some time, but declaring the variable result in the pattern, so that it stays in scope beyond the if statement and can be used in the logic thereafter, is a C# 7.0 feature.
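The three conversion styles described above can be compared side by side. This is only a minimal sketch: the class name CastDemo and the sample data are made up for illustration, and it assumes a C# 7.0 or later compiler.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CastDemo
{
    public static void Main()
    {
        // A deferred LINQ query: its runtime type is an internal iterator, not List<string>.
        IEnumerable<string> query = new[] { "a", "b", "c" }.Where(s => s != "b");

        // `as` yields null instead of throwing when the object is not a List<string>.
        var viaAs = query as List<string>;
        Console.WriteLine(viaAs == null); // True

        // An explicit cast on the same object throws InvalidCastException.
        try
        {
            _ = (List<string>)query;
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("cast failed");
        }

        // When the object really is a List<string>, the pattern variable is the same instance.
        IEnumerable<string> backedByList = new List<string> { "x", "y" };
        if (backedByList is List<string> list)
        {
            Console.WriteLine(ReferenceEquals(list, backedByList)); // True
        }
    }
}
```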

Compared to this:

Option b - result = myStrings.ToList() is an extension method call on IEnumerable<T> that allocates and populates a new List<T>. This is always an extra memory allocation. The source code of Enumerable.ToList() is:

public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source) {
    if (source == null) throw Error.ArgumentNull("source");
    return new List<TSource>(source);
}
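To see the effect of that new List<TSource>(source) call, a small check along these lines may help. The class name AllocationDemo and the sample values are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllocationDemo
{
    public static void Main()
    {
        // Here the sequence happens to be backed by a real List<string>.
        IEnumerable<string> myStrings = new List<string> { "a", "b" };

        var viaAs = myStrings as List<string>;
        var viaToList = myStrings.ToList();

        Console.WriteLine(ReferenceEquals(viaAs, myStrings));     // True: same object
        Console.WriteLine(ReferenceEquals(viaToList, myStrings)); // False: a new list was allocated

        // On a null source, `as` quietly yields null while ToList throws.
        IEnumerable<string> missing = null;
        Console.WriteLine((missing as List<string>) == null);     // True
        try
        {
            missing.ToList();
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("ToList threw ArgumentNullException");
        }
    }
}
```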
Mrinal Kamboj
  • "This is always an extra allocation of memory" - good point. I never checked this but I always assumed that `ToList` would check if `source` is already a list and just return it. – CompuChip Aug 26 '18 at 08:57
  • As the code suggests, it always does `new List`, so that the caller can work with new memory without worrying about modifying the original source. If we need the existing collection, typecasting is required, which is a good optimization step for large collections – Mrinal Kamboj Aug 26 '18 at 09:00
  • Libraries like `Dapper` offer `AsList`, which first tries the type cast and falls back to `ToList` if the cast fails. – Mrinal Kamboj Aug 26 '18 at 09:07
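The "cast first, copy only if needed" idea from the last comment could be sketched roughly as follows. AsListOrCopy is a hypothetical helper name invented for this example; it is not Dapper's actual implementation and not part of the BCL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Hypothetical helper: reuse the existing list when possible, otherwise copy.
    public static List<T> AsListOrCopy<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return source as List<T> ?? source.ToList();
    }
}

public static class AsListDemo
{
    public static void Main()
    {
        var list = new List<string> { "a", "b" };
        IEnumerable<string> query = list.Where(s => s != "b");

        // Already a List<string>: no copy, same instance comes back.
        Console.WriteLine(ReferenceEquals(list.AsListOrCopy(), list)); // True

        // A LINQ query is not a List<string>, so a new list is built.
        Console.WriteLine(query.AsListOrCopy().Count);                 // 1
    }
}
```

The trade-off is that the caller may receive the original list, so modifying the result can modify the source.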
1

ToList() is a method that creates a new List<T> object. It requires the source to be an IEnumerable<T>, so the elements must already be of the correct type.

ToList iterates over the collection to build the new list, so it's slower than just casting, but safer and more generic.

Because the method creates new list, changes to original collections are not reflected in the list.

as List<T> casts the object to List<T>. That means the object must either be a List<T> or an instance of a type that inherits from it.

With as, the result is null whenever the object cannot be cast to List<T>.

Because the cast result still refers to the same object, changes to the original collection are reflected in the list.
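The difference in how later changes show up can be checked with a short experiment. This is a sketch with made-up names (AliasDemo, source, alias, copy), assuming the sequence is backed by a List<string>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AliasDemo
{
    public static void Main()
    {
        var source = new List<string> { "a", "b" };
        IEnumerable<string> myStrings = source;

        var alias = myStrings as List<string>; // same object as source
        var copy = myStrings.ToList();         // independent list with the same elements

        source.Add("c");
        Console.WriteLine(alias.Count);  // 3: the change is visible through the cast result
        Console.WriteLine(copy.Count);   // 2: the copy is unaffected

        copy.Add("z");
        Console.WriteLine(source.Count); // 3: adding to the copy does not touch the source
    }
}
```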


More information:

ToList():

https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.tolist?redirectedfrom=MSDN&view=netframework-4.7.2#System_Linq_Enumerable_ToList__1_System_Collections_Generic_IEnumerable___0__

as:

https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/as

madoxdev
0

ToList will create a new list, but since string is a reference type, the new list will contain references to the same string objects as the original sequence.

With a mutable element type, updating a property of an object reached through the new list would also affect the equivalent object in the original collection, because both hold the same object. Strings are immutable, so that cannot happen here.

So, in terms of the elements themselves, there is no immediate difference between the two; ToList does not change myStrings itself, it only creates a separate list object.

That answers your question of whether option b changes myStrings itself.
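The shared-reference behaviour is easier to observe with a mutable element type than with strings. The Person class below is a hypothetical stand-in invented for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical mutable element type, used only for illustration.
public class Person
{
    public string Name { get; set; } = "";
}

public static class ShallowCopyDemo
{
    public static void Main()
    {
        var original = new List<Person> { new Person { Name = "Ann" } };
        var copy = original.ToList();

        // Both lists hold a reference to the same Person object.
        copy[0].Name = "Bob";
        Console.WriteLine(original[0].Name); // Bob

        // The list objects themselves are independent.
        copy.Add(new Person { Name = "Cid" });
        Console.WriteLine(original.Count);   // 1
    }
}
```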

Regarding performance: yes, there will be a difference. As stated in this post:

Yes, IEnumerable<T>.ToList() does have a performance impact: it is an O(n) operation, though it will likely only require attention in performance-critical operations.

The ToList() operation will use the List(IEnumerable<T> collection) constructor. This constructor must make a copy of the array (more generally, of the IEnumerable<T>); otherwise, later modifications to the list would also change the source T[], which generally wouldn't be desirable.

This answers your second question, whether there is a performance impact.
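One way to observe the O(n) cost is to count how many elements get visited. This is a rough sketch with illustrative names (EnumerationCostDemo, visited), not a benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCostDemo
{
    public static void Main()
    {
        int visited = 0;

        // Deferred query: the lambda runs only when the sequence is enumerated.
        IEnumerable<string> query = Enumerable.Range(1, 5)
            .Select(i => { visited++; return i.ToString(); });
        Console.WriteLine(visited);       // 0

        // ToList walks the whole sequence once to fill the new list.
        var list = query.ToList();
        Console.WriteLine(visited);       // 5

        // `as` only inspects the runtime type; nothing is enumerated.
        var viaAs = query as List<string>;
        Console.WriteLine(visited);       // 5
        Console.WriteLine(viaAs == null); // True
    }
}
```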

Barr J