200

I have the following function to get validation errors for a card. My question relates to dealing with GetErrors. Both methods have the same return type IEnumerable<ErrorInfo>.

private static IEnumerable<ErrorInfo> GetErrors(Card card)
{
    var errors = GetMoreErrors(card);
    foreach (var e in errors)
        yield return e;
    
    // further yield returns for more validation errors
}

Is it possible to return all the errors in GetMoreErrors without having to enumerate through them?

ΩmegaMan
  • 29,542
  • 12
  • 100
  • 122
John Oxley
  • 14,698
  • 18
  • 53
  • 78
  • 4
    whats wrong with *return GetMoreErrors(card);* ? – Sam Saffron Aug 13 '09 at 04:54
  • 10
    @Sam: "further yield returns for more validation errors" – Jon Skeet Aug 13 '09 at 05:30
  • 1
    From the standpoint of a non-ambiguous language, one issue is that the method can't know if there is anything that implements both T and IEnumerable. So you need a different construct in the yield. That said, it sure would be nice to have a way to do this. Yield return yield foo, perhaps, where foo implements IEnumerable? – William Jockusch Jan 01 '16 at 09:38
  • 2
    For those interested, the C# language feature request for this is located here: https://github.com/dotnet/csharplang/issues/378, I believe. – StayOnTarget Sep 01 '20 at 16:16
  • For reference issue 378, was actually original a more complex edge case of this with performance considerations (something about recursive calls? idk.). Off the back of that conversation I've raised a dedicated discussion solely for pitching the change as a purely syntactic sugar: https://github.com/dotnet/csharplang/discussions/5303 – Brondahl Oct 19 '21 at 18:00

6 Answers6

172

It is something that F# supports with yield! for a whole collection vs yield for a single item. (That can be very useful in terms of tail recursion...)

Unfortunately it's not supported in C#.

However, if you have several methods each returning an IEnumerable<ErrorInfo>, you can use Enumerable.Concat to make your code simpler:

private static IEnumerable<ErrorInfo> GetErrors(Card card)
{
    return GetMoreErrors(card).Concat(GetOtherErrors())
                              .Concat(GetValidationErrors())
                              .Concat(AnyMoreErrors())
                              .Concat(ICantBelieveHowManyErrorsYouHave());
}

There's one very important difference between the two implementations though: this one will call all of the methods immediately, even though it will only use the returned iterators one at a time. Your existing code will wait until it's looped through everything in GetMoreErrors() before it even asks about the next errors.

Usually this isn't important, but it's worth understanding what's going to happen when.

vvvvv
  • 25,404
  • 19
  • 49
  • 81
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • 3
    Wes Dyer has an interesting article mentioning this pattern. http://blogs.msdn.com/wesdyer/archive/2007/03/23/all-about-iterators.aspx – JohannesH Aug 26 '09 at 04:06
  • 1
    Minor correction for passers by - it's System.Linq.Enumeration.Concat<>(first,second). Not IEnumeration.Concat(). – redcalx Apr 18 '10 at 23:01
  • @the-locster: I'm not sure what you mean. It's definitely Enumerable rather than Enumeration. Could you clarify your comment? – Jon Skeet Apr 19 '10 at 06:39
  • 1
    @Jon Skeet - What exactly do you mean that it will call the methods immediately? I ran a test and it looks like it's deferring the method calls completely until something is actually iterated. Code here: http://pastebin.com/0kj5QtfD – Steven Oxley Aug 09 '10 at 23:20
  • 7
    @Steven: Nope. It's *calling* the methods - but in your case `GetOtherErrors()` (etc) are deferring their *results* (as they're implemented using iterator blocks). Try changing them to return a new array or something like that, and you'll see what I mean. – Jon Skeet Aug 10 '10 at 05:22
  • 1
    @Jon OK, I get it. I guess I overlooked the fact that an array implements IEnumerable as well, and that calling a method and getting the results of a method are two different things (I'm new to the idea of iterator blocks). Thanks for clarifying. – Steven Oxley Aug 10 '10 at 12:07
  • Archived link to the article mentioned as that link is broken: https://web.archive.org/web/20100118093338/http://blogs.msdn.com/wesdyer/archive/2007/03/23/all-about-iterators.aspx – bracco23 Jun 08 '23 at 13:48
  • Also updated link directly on ms website: https://learn.microsoft.com/en-us/archive/blogs/wesdyer/all-about-iterators – bracco23 Jun 08 '23 at 13:49
37

You could set up all the error sources like this (method names borrowed from Jon Skeet's answer).

private static IEnumerable<IEnumerable<ErrorInfo>> GetErrorSources(Card card)
{
    yield return GetMoreErrors(card);
    yield return GetOtherErrors();
    yield return GetValidationErrors();
    yield return AnyMoreErrors();
    yield return ICantBelieveHowManyErrorsYouHave();
}

You can then iterate over them at the same time.

private static IEnumerable<ErrorInfo> GetErrors(Card card)
{
    foreach (var errorSource in GetErrorSources(card))
        foreach (var error in errorSource)
            yield return error;
}

Alternatively you could flatten the error sources with SelectMany.

private static IEnumerable<ErrorInfo> GetErrors(Card card)
{
    return GetErrorSources(card).SelectMany(e => e);
}

The execution of the methods in GetErrorSources will be delayed too.

Adam Boddington
  • 6,750
  • 2
  • 21
  • 12
22

I came up with a quick yield_ snippet:

yield_ snipped usage animation

Here's the snippet XML:

<?xml version="1.0" encoding="utf-8"?>
<CodeSnippets xmlns="http://schemas.microsoft.com/VisualStudio/2005/CodeSnippet">
  <CodeSnippet Format="1.0.0">
    <Header>
      <Author>John Gietzen</Author>
      <Description>yield! expansion for C#</Description>
      <Shortcut>yield_</Shortcut>
      <Title>Yield All</Title>
      <SnippetTypes>
        <SnippetType>Expansion</SnippetType>
      </SnippetTypes>
    </Header>
    <Snippet>
      <Declarations>
        <Literal Editable="true">
          <Default>items</Default>
          <ID>items</ID>
        </Literal>
        <Literal Editable="true">
          <Default>i</Default>
          <ID>i</ID>
        </Literal>
      </Declarations>
      <Code Language="CSharp"><![CDATA[foreach (var $i$ in $items$) yield return $i$$end$;]]></Code>
    </Snippet>
  </CodeSnippet>
</CodeSnippets>
riQQ
  • 9,878
  • 7
  • 49
  • 66
John Gietzen
  • 48,783
  • 32
  • 145
  • 190
9

I'm surprised no one has thought to recommend a simple Extension method on IEnumerable<IEnumerable<T>> to make this code keep its deferred execution. I'm a fan of deferred execution for many reasons, one of them is that the memory footprint is small even for huge-mongous enumerables.

public static class EnumearbleExtensions
{
    public static IEnumerable<T> UnWrap<T>(this IEnumerable<IEnumerable<T>> list)
    {
        foreach(var innerList in list)
        {
            foreach(T item in innerList)
            {
                yield return item;
            }
        }
    }
}

And you could use it in your case like this

private static IEnumerable<ErrorInfo> GetErrors(Card card)
{
    return DoGetErrors(card).UnWrap();
}

private static IEnumerable<IEnumerable<ErrorInfo>> DoGetErrors(Card card)
{
    yield return GetMoreErrors(card);

    // further yield returns for more validation errors
}

Similarly, you can do away with the wrapper function around DoGetErrors and just move UnWrap to the callsite.

huysentruitw
  • 27,376
  • 9
  • 90
  • 133
Frank Bryce
  • 8,076
  • 4
  • 38
  • 56
  • 6
    Probably nobody thought about an Extension method because `DoGetErrors(card).SelectMany(x => x)` does the same and preserves the deferred behavior. Which is exactly what Adam suggests in [his answer](http://stackoverflow.com/a/22912410/1300910). – huysentruitw Oct 25 '16 at 09:18
8

I don't see anything wrong with your function, I'd say that it is doing what you want.

Think of the Yield as returning an element in the final Enumeration each time that it is invoked, so when you have it in the foreach loop like that, each time it is invoked it returns 1 element. You have the ability to put conditional statements in your foreach to filter the resultset. (simply by not yielding on your exclusion criteria)

If you add subsequent yields later in the method, it will continue to add 1 element to the enumeration, making it possible to do things like...

public IEnumerable<string> ConcatLists(params IEnumerable<string>[] lists)
{
  foreach (IEnumerable<string> list in lists)
  {
    foreach (string s in list)
    {
      yield return s;
    }
  }
}
Tim Jarvis
  • 18,465
  • 9
  • 55
  • 92
3

Yes it is possible to return all errors at once. Just return a List<T> or ReadOnlyCollection<T>.

By returning an IEnumerable<T> you're returning a sequence of something. On the surface that may seem identical to returning the collection, but there are a number of difference, you should keep in mind.

Collections

  • The caller can be sure that both the collection and all the items will exist when the collection is returned. If the collection must be created per call, returning a collection is a really bad idea.
  • Most collections can be modified when returned.
  • The collection is of finite size.

Sequences

  • Can be enumerated - and that is pretty much all we can say for sure.
  • A returned sequence itself cannot be modified.
  • Each element may be created as part of running through the sequence (i.e. returning IEnumerable<T> allows for lazy evaluation, returning List<T> does not).
  • A sequence may be infinite and thus leave it to the caller to decide how many elements should be returned.
Brian Rasmussen
  • 114,645
  • 34
  • 221
  • 317
  • Returning a collection can result in unreasonable overhead if all the client really needs is to enumerate through it, since you allocate the data structures for all elements in advance. Also, if you delegate to another method that's returning a sequence, then capturing it as a collection involves extra copying, and you do not know how many items (and thus how much overhead) this may potentially involve. Thus, it is only a good idea to return collection when it is already there and can be returned directly without copying (or wrapped as readonly). In all other cases, sequence is a better choice – Pavel Minaev Aug 13 '09 at 05:58
  • I agree, and if you got the impression that I said returning a collection is always a good idea you missed my point. I was trying to highlight the fact that there are differences between returning a collection and returning a sequence. I will try to make it clearer. – Brian Rasmussen Aug 13 '09 at 06:06