6

Can somebody explain to me why we constantly use IEnumerable all over C# when, by my logic we should be using List,IList or ICollection instead.

Look at this commit fix. enter image description here

Ok, number 1) IEnumarable should only be used for read only collections!

Great, everybody keeps repeating that (I get it), so why do I need to constantly need putting ToList() on every single database call to prevent mulit-threads creashing when the original thread disposes of context.

Surely once the query has been Enumarated to a List, we should just use List from there on to avoid confusion, thread safety issues and constantly calling ToList() on every single model, over and over.

Also on WebAPI that parse JSON, a text representation of an already enumerated object, we store it as IEnumarable again. We already know the size, we not interested in lazy loading it, we probably going to modify that object somewhere along the line, so why don't we store it as List of.

--EDIT

One important aspect about this question is that I never knew about covariance and contravariance

You cannot do List<IItem> because of runtime problems on the List implementation so you have to use IEnumarable<IItem> - This will only make sense in specific cases where you have the same entity, on two different systems, that need to work on one piece of code. Otherwise I have been using List<> since asking the question and that is probably 90% of the time correct to use in business layers. Database/Repository layers stick to IColletion now or IQueryable.

Cœur
  • 37,241
  • 25
  • 195
  • 267
Piotr Kula
  • 9,597
  • 8
  • 59
  • 85
  • 1
    Why do programmers write buggy code? Well, multiply the typical understanding of threading by IDisposable by async and you get pretty low odds for success. The addition of the async keyword is a rather mixed blessing. It was a boon to programmers that *use* an api, like those that want to write WinRT code and not die trying, it tends to be a pretty blunt instrument to programmers that *create* an api. Every abstraction that tries to make threading looks easy adds five new failure modes that are hard to debug. – Hans Passant Jun 29 '17 at 13:02
  • Yip.. so that kind of means because of that the fundamental rules need to be changed :D – Piotr Kula Jun 29 '17 at 13:14

2 Answers2

12

It's kinda hard to answer this because it's very easy to post an opinion, and not an answer. I'll try though.

This is a fundamental question about responsibility, more specifically, good API design.

Here are my thoughts about the specific corner of the world of API design that this covers:

  • Be as specific as you can in your output types
  • Be as open as you can in your input types

I'll give a few examples. If your method can take any collection, take IEnumerable<T>. If your method needs an indexable collection, take IList<T> or similar. If the first thing the method does is convert the input to a List through .ToList(), be open about it. If I already have a list on the outside and I want to give it to your method, take List<T>.

However, I must be aware of the fact that once you say you take IList<T>, you're able to modify the list. If that is OK, good, I can simply send you my list. Otherwise, I will have to execute .ToList() on the outside. This is not worse than before, it's simply a question about who is responsible for doing so.

On the opposite side, if your method returns an object that is a List<T>, return it as List<T> or IList<T>. Don't hide it behind IEnumerable<T>.

Here you should also be aware that any caller that gets this list back is able to modify it. If this is not OK, don't return it as IList<T>.

Though, you can say in that last example that if that list is not supposed to be modified on the outside, after returning (it's not the callers list), then you should hide it behind IEnumerable<T>, but this only hides the fact, the programmer can still cast it back to IList<T> and modify it. If you can't accept this, return a copy through .ToList().

However, all of this goes back to responsibility.

If your method returns IEnumerable<T>, shouldn't I, the person writing code that is using this method, rely on your method working correctly? If your method won't work because you've already disposed of the database context, then that's on you, your code, your bug, your problem.

Likewise, if I need something that can be indexed into, but other than that is a purely readonly collection, I should design my method to take the appropriate collection interface to signal this. If I design my method to take IEnumerable<T>, I should know that I'm specifically saying "I'll take any lazily produced collection". If I don't know that, or don't want to say that, then I should not take IEnumerable<T>.

So as to answer the underlying question for your specific line of code. Here you absolutely must call .ToList() inside because otherwise your methods is broken, buggy, wrong.

OR, you can redesign your method in such a way that it retains its lazy nature, but doesn't dispose of the database context until it's ready. A simple change to your method in this regard would be to use yield return:

using (var context = ... )
{
    foreach (var element in context.Request(...))
        yield return element;
}

With such a change you're still postponing the actual cost of going to the database until the caller actually iterates over the collection you returned, and you're still disposing of the context correctly (assuming the caller does the iteration correct). Here I would still return IEnumerable<T> to specifically say "I'm lazily evaluated".

However, if the list that you produce internally is not kept, nobody is responsible for it, you're effectively giving the responsibility of this list to the caller, then absolutely should you return it as List<T>. Don't hide it behind IEnumerable<T>.

Lasse V. Karlsen
  • 380,855
  • 102
  • 628
  • 825
  • 1
    Surprisingly, that makes absolute sense to me now. I never though about it as "responsibility" driven (an nobody really talks about it like that) Everybody just blindly quotes the basic definition without actually being able to prove its the correct type to use. I knew I was wrong, just nobody could prove my theory was wrong. I now know exactly why ToList is hidden behind IEnumarables - Or maybe I am just really slow and it took me over 10 years to finally understand this. Thank you for spending some of your valuable time to explain it to me like this. Really appreciate it. – Piotr Kula Jun 29 '17 at 12:23
  • Its not just about methodology, there are some real performance gains too if you use iterator methods. – Kieran Devlin Jun 29 '17 at 12:24
  • Since you mentioned performance... :) I would have thought that once you have an enumerated list in memory, it will always be faster to access the List in memory, rather than iterating it constantly. (But I know very little about the mechanics of the iterator) – Piotr Kula Jun 29 '17 at 12:26
  • If the `IEnumerable` reference you get back is just a `IList` in disguise, looping over it won't cost more, unless you need to say "give me the 10025th element". – Lasse V. Karlsen Jun 29 '17 at 12:28
  • Well that is what we mostly do.. We create sub sets of data using Linq from the original IEnumarable. For example, `Skip(x)`, `Take(y) ` or even `Where(x = y)` - Sure, we do not explicitly say I want element at Index 10025, but more than half or everybodies code is, we want the Element with the Id=10025 - Technically IEnumarable has to iterate until it finds it and if we want another one it has to re iterate again? Surely, I know all this executes in nanoseconds and has no real life impact on performance, unless you are working with fractals :D – Piotr Kula Jun 29 '17 at 12:32
  • PS Thanks for the Edit. I think that makes sense, if we are returning List from the repository layer then its 100% clear it has been iterated, no lazy loading allowed, the list is the data, end of story. If then it passed around all the other layers as IEnumerable just to mark the collection as read-only - Fine. I'll go with that - But still in the back of my head I know the data source is a List (since we did not build our implementation of IEnumarable and don't plan to) so it will still bug me to death.. But at least now I have some tangible opinion by which I can carry on codding. – Piotr Kula Jun 29 '17 at 12:39
  • Yesterday I found out probably the most important reason for using IEnumarable or IList - and that is all about [covariance and contravariance](https://blogs.msdn.microsoft.com/csharpfaq/2010/02/16/covariance-and-contravariance-faq/) - NOW I properly understand the importance of which one to use when. Safe to say I have been using List<> on all my code since asking the question but yesterday was D day. I wanted to do List and that is a No No, since List is covariant. Makes perfect sense now. – Piotr Kula Jan 10 '18 at 08:07
0

IEnumerable does not explicitly say what collection type the data set is therefore you can do more generic programming rather than having specific functions for specific types.

Kieran Devlin
  • 1,373
  • 1
  • 12
  • 28
  • Awesome.. but why do we constantly need to make everything generic if keeps causing problems :D Espcailly that IEnumarable works completely different than IList .. they not really common, yet IEnumarbale is the smallest interface. It is a bit annoying – Piotr Kula Jun 29 '17 at 12:16
  • 1) What problems does it cause exactly? Only problems I know of are case specific ones. 2) Not all collections are a type of list, therefore you want to use a more generic type i.e IEnumerable. https://stackoverflow.com/questions/3228708/what-should-i-use-an-ienumerable-or-ilist – Kieran Devlin Jun 29 '17 at 12:19