Is yield useful outside of LINQ?

Question

Whenever I think I can use the yield keyword, I take a step back and look at how it will impact my project. I always end up returning a collection instead of yielding, because I feel the overhead of maintaining the state of the yielding method doesn't buy me much. In almost all cases where I return a collection, I feel that 90% of the time the calling method will iterate over all elements in the collection, or will seek a series of elements throughout the entire collection.

I do understand its usefulness in LINQ, but I feel that only the LINQ team is writing such complex queryable objects that yield is useful.

Has anyone written anything, LINQ-like or not, where yield was useful?

Did you mean outside of Linq, or IEnumerable? I should imagine that uses of yield other than in enumerators would be pretty rare (and interesting). Jon Skeet mentions one in his book... — Benjol, Jun 04 '09 at 13:15
A very interesting usage of yield is in Jeffrey Richter's [Power Threading Library](http://msdn.microsoft.com/en-us/magazine/cc546608.aspx) — Yuriy Zanichkovskyy, Jan 27 '10 at 16:10

James Curran · Answer 1 · 2008-11-25T15:50:21.757

28

Note that with yield, you are iterating over the collection once, but when you build a list, you'll be iterating over it twice.

Take, for example, a filter iterator:

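static IEnumerable<T> Filter<T>(this IEnumerable<T> coll, Func<T, bool> func)
{
    // foreach needs an IEnumerable<T>, and the method must be generic and static
    // (inside a static class) to work as an extension method.
    foreach (T t in coll)
        if (func(t)) yield return t;
}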

Now, you can chain this:

 MyColl.Filter(x=> x.id > 100).Filter(x => x.val < 200).Filter (etc)

Your method would be creating (and tossing) three lists. My method iterates over it just once.

Also, when you return a collection, you are forcing a particular implementation on your users. An iterator is more generic.

edited Nov 25 '08 at 15:50

answered Nov 25 '08 at 15:05

James Curran


Wouldn't performing that filter be more straightforward with linq though? – Bob Nov 25 '08 at 17:01
That filter is basically what the LINQ Where extension method is. – Thedric Walker Nov 25 '08 at 17:29
That is my point though, I think it would be more straightforward to use linq, would you ever write that filtering code instead of using linq? What benefits would you get? – Bob Nov 25 '08 at 18:10
@Bob "Linq" is "Language Integrated Query", i.e. specifically the keywords "from", "where", "orderby" etc. They are changed by the compiler into a chained expression similar to the one in the answer. They are equivalent. The Filter method was included just as an example. – James Curran Sep 16 '15 at 15:53

score 19 · Answer 2 · answered Dec 02 '08 at 23:51

I do understand its usefulness in LINQ, but I feel that only the LINQ team is writing such complex queryable objects that yield is useful.

Yield was useful as soon as it got implemented in .NET 2.0, which was long before anyone ever thought of LINQ.

Why would I write this function:

IList<string> LoadStuff() {
  var ret = new List<string>();
  foreach(var x in SomeExternalResource)
    ret.Add(x);
  return ret;
}

When I can use yield, and save the effort and complexity of creating a temporary list for no good reason:

IEnumerable<string> LoadStuff() {
  foreach(var x in SomeExternalResource)
    yield return x;
}

It can also have huge performance advantages. If your code only happens to use the first 5 elements of the collection, then using yield will often avoid the effort of loading anything past that point. If you build a collection then return it, you waste a ton of time and space loading things you'll never need.
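To make the "first 5 elements" point concrete, here is a minimal sketch. The `LazyLoadingSketch` class, the `Produced` counter, and the million-item loop are all made up for illustration; they stand in for whatever `SomeExternalResource` really is.

```csharp
using System.Collections.Generic;

public static class LazyLoadingSketch
{
    // Counts how many items the iterator actually produced (illustration only).
    public static int Produced;

    // Hypothetical stand-in for an expensive external resource.
    public static IEnumerable<string> LoadStuff()
    {
        for (int i = 0; i < 1000000; i++)
        {
            Produced++;
            yield return "item " + i;
        }
    }

    // Reads only the first `count` items, then stops pulling from the iterator.
    public static List<string> FirstFew(int count)
    {
        var result = new List<string>();
        if (count <= 0) return result;
        foreach (string s in LoadStuff())
        {
            result.Add(s);
            if (result.Count == count) break;
        }
        return result;
    }
}
```

Because the caller breaks out of the loop early, the iterator body never runs its remaining iterations. A version that built a full list first would have produced all one million strings.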

I could go on and on....

I do believe that Anders Hejlsberg was working on Linq several years ago. — Tom Stickel, Nov 17 '11 at 23:47

Morten Christiansen · Accepted Answer · 2008-11-25T17:59:08.190

12

I recently had to make a representation of mathematical expressions in the form of an Expression class. When evaluating the expression I have to traverse the tree structure with a post-order treewalk. To achieve this I implemented IEnumerable<T> like this:

public IEnumerator<Expression<T>> GetEnumerator()
{
    if (IsLeaf)
    {
        yield return this;
    }
    else
    {
        foreach (Expression<T> expr in LeftExpression)
        {
            yield return expr;
        }
        foreach (Expression<T> expr in RightExpression)
        {
            yield return expr;
        }
        yield return this;
    }
}

Then I can simply use a foreach to traverse the expression. You can also add a property to change the traversal algorithm as needed.
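For anyone who wants to try the pattern in isolation, this is a rough self-contained sketch. The `Node` class below is hypothetical and much simpler than the `Expression<T>` class described in the answer; it only shows the post-order shape of the iterator.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical minimal tree node, not the answer's actual Expression<T> class.
public class Node : IEnumerable<Node>
{
    public string Label;
    public Node Left;
    public Node Right;

    public Node(string label, Node left = null, Node right = null)
    {
        Label = label;
        Left = left;
        Right = right;
    }

    // Post-order: everything in the left subtree, then the right subtree, then this node.
    public IEnumerator<Node> GetEnumerator()
    {
        if (Left != null)
            foreach (Node n in Left) yield return n;
        if (Right != null)
            foreach (Node n in Right) yield return n;
        yield return this;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

For the tree representing (a + b) * c, a plain foreach over the root visits a, b, +, c, * in that order. One thing to keep in mind is that each level of nesting adds another iterator to the chain, so very deep trees pay for that.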

edited Nov 25 '08 at 17:59

answered Nov 25 '08 at 17:04

Morten Christiansen


C# really needs a yieldcollection keyword to abstract out the foreach(x in collection){ yield x } loops that everyone writes 100x a day these days :-( – Orion Edwards Dec 02 '08 at 23:53
if you are just doing foreach(x in collection) {yield return x;} ... you can just do .Select(x=>x). if you want to do work against a set of items in a collection you can make an extension method .Foreach(IEnumerable col, Action action) – Matthew Whited Jan 27 '10 at 16:16

score 11 · Answer 4 · edited Feb 06 '13 at 05:16

At a previous company, I found myself writing loops like this:

for (DateTime date = schedule.StartDate; date <= schedule.EndDate; 
     date = date.AddDays(1))

With a very simple iterator block, I was able to change this to:

foreach (DateTime date in schedule.DateRange)

It made the code a lot easier to read, IMO.
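The answer does not show the iterator block itself, so the following is only a guess at what such a property could look like. The `Schedule` class and its members are assumed from the names used in the loops above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Schedule type; the original class is not shown in the answer.
public class Schedule
{
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }

    // Yields one DateTime per day, from StartDate through EndDate inclusive.
    public IEnumerable<DateTime> DateRange
    {
        get
        {
            for (DateTime date = StartDate; date <= EndDate; date = date.AddDays(1))
                yield return date;
        }
    }
}
```

The for loop has not gone away; it has just moved into one place, so every caller can write the foreach form.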

edited Feb 06 '13 at 05:16

Nawaz


answered Nov 25 '08 at 15:24

Jon Skeet


Wow - Jon Skeet code I don't agree with! =X From the first example it's obvious that you're iterating over days, but that clarity is missing in the second. I'd use something like 'schedule.DateRange.Days()' to avoid ambiguity. – Erik Forbes Nov 25 '08 at 17:29
That would require more than just implementing a single property, of course. I'd say that it's obvious that a DateRange is a range of dates, which are days, but it's a subjective thing. It might have been called "Dates" rather than DateRange - not sure. Either way, it's less fluff than the original. – Jon Skeet Nov 25 '08 at 17:37
Yeah, that's true. *shrugs* I wouldn't personally be satisfied with it, but if it's clear to the author and any future maintainers, then it doesn't really matter. – Erik Forbes Nov 25 '08 at 17:48
Also, I am just splitting hairs - your example demonstrates the usefulness of iterator blocks, and that's what's important for this question. Sorry to nit-pick. =X – Erik Forbes Nov 25 '08 at 17:49
Nitpicking is fine, splitting hairs is good, comments and suggestions on coding style are always welcome :) – Jon Skeet Nov 25 '08 at 20:18
I would disagree that the for loop is actually more clear about it being "days" than the foreach loop. StartDate and EndDate still refer to "DateTime values"...which does not imply days. You could have multiple DateTime values in the same day for different hours. The only true source of "day" in the for loop version is the behavior INSIDE the loop...not the loop itself. If the same code was used inside the foreach loop, the same clarity would exist there as well. – jrista Feb 15 '13 at 19:01
@jrista: The fact that the property is named `DateRange` is enough to say it's dates, for me. Of course these days I'd absolutely want it to be an `IEnumerable<LocalDate>` using Noda Time :) – Jon Skeet Feb 15 '13 at 22:10
Aye, a LocalDate enumerable would certainly remove any ambiguity. :) I do find it interesting how context affects perception in cases like these, though. The `for` case had richer context than the `foreach` case, with a marked difference in perceived clarity and interpretation (despite the lack of any *real* difference between the two loops at all). I think the context of a piece of code is often an ignored or loosely recognized concept that can be quite critical to a broad base of readers' understanding of that code. – jrista Feb 16 '13 at 01:05
Which, BTW, I am not saying to degrade your answer. I just mean in the broader context of writing code that is maintainable and understandable. ;P – jrista Feb 16 '13 at 01:06

score 8 · Answer 5 · answered Nov 25 '08 at 16:12

yield was developed for C#2 (before Linq in C#3).

We used it heavily in a large enterprise C#2 web application when dealing with data access and heavily repeated calculations.

Collections are great any time you have a few elements that you're going to hit multiple times.

However in lots of data access scenarios you have large numbers of elements that you don't necessarily need to pass round in a great big collection.

This is essentially what the SqlDataReader does - it's a forward only custom enumerator.

What yield lets you do is quickly and with minimal code write your own custom enumerators.

Everything yield does could be done in C#1 - it just took reams of code to do it.
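To give a feel for the "reams of code", here is a small comparison sketch. The `Countdown` class and its nested types are invented for this example. Note that C# 1 had no generics, so real C# 1 code would have used the non-generic `IEnumerable`/`IEnumerator`; the generic interfaces are used here only so the two versions are directly comparable.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class Countdown
{
    // With yield: the compiler generates the state machine for you.
    public static IEnumerable<int> WithYield(int from)
    {
        for (int i = from; i > 0; i--)
            yield return i;
    }

    // Without yield: roughly the same behaviour written by hand.
    public static IEnumerable<int> ByHand(int from)
    {
        return new CountdownSequence(from);
    }

    private sealed class CountdownSequence : IEnumerable<int>
    {
        private readonly int _from;
        public CountdownSequence(int from) { _from = from; }
        public IEnumerator<int> GetEnumerator() { return new CountdownEnumerator(_from); }
        IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
    }

    private sealed class CountdownEnumerator : IEnumerator<int>
    {
        private readonly int _from;
        private int _current;
        private bool _started;

        public CountdownEnumerator(int from) { _from = from; }

        public int Current { get { return _current; } }
        object IEnumerator.Current { get { return _current; } }

        // Moves to the next value; returns false once the count reaches zero.
        public bool MoveNext()
        {
            if (!_started) { _started = true; _current = _from; }
            else { _current--; }
            return _current > 0;
        }

        public void Reset() { _started = false; }
        public void Dispose() { }
    }
}
```

Both methods produce the same sequence, but the hand-written one needs two extra classes and explicit state tracking for a three-line loop.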

Linq really maximises the value of the yield behaviour, but it certainly isn't the only application.

score 2 · Answer 6 · answered Jan 27 '10 at 15:41

I am a huge yield fan in C#. This is especially true in large homegrown frameworks, where methods or properties often return a List that is a subset of another IEnumerable. The benefits that I see are:

- the return value of a method that uses yield is read-only to the caller
- you are only iterating over the list once
- execution is late (lazy), meaning the code that produces the values is not executed until needed (though this can bite you if you don't know what you're doing)
- if the source list changes, you don't have to call the method again to get another IEnumerable; you just iterate over the same IEnumerable again
- many more
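The point about source changes is easy to miss, so here is a small sketch of it. `DeferredSketch`, `Evens`, and `Count` are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;

public static class DeferredSketch
{
    // The body runs again each time the returned sequence is enumerated.
    public static IEnumerable<int> Evens(List<int> source)
    {
        foreach (int n in source)
            if (n % 2 == 0) yield return n;
    }

    // Enumerates the sequence once and counts its elements.
    public static int Count(IEnumerable<int> sequence)
    {
        int count = 0;
        foreach (int unused in sequence) count++;
        return count;
    }
}
```

If you call `Evens` once on a list holding 1, 2, 3, count the result, add 4 to the list, and count the same result again, the second count sees the new element. That is the convenience, and also the way it can bite you: every enumeration repeats the work and reflects whatever the source looks like at that moment.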

One other HUGE benefit of yield shows up when your method could return millions of values, so many that you risk running out of memory just building the List before the method can even return it. With yield, the method can create and return millions of values one at a time, as long as the caller doesn't also store every value. So it's good for large-scale data processing and aggregation operations.

score 2 · Answer 7 · answered Nov 25 '08 at 15:21

Whenever your function returns IEnumerable you should use "yielding". This does not apply only to .NET versions after 3.0.

.NET 2.0 example:

  public static class FuncUtils
  {
      public delegate T Func<T>();
      public delegate T Func<A0, T>(A0 arg0);
      public delegate T Func<A0, A1, T>(A0 arg0, A1 arg1);
      ... 

      public static IEnumerable<T> Filter<T>(IEnumerable<T> e, Func<T, bool> filterFunc)
      {
          foreach (T el in e)
              if (filterFunc(el)) 
                  yield return el;
      }


      public static IEnumerable<R> Map<T, R>(IEnumerable<T> e, Func<T, R> mapFunc)
      {
          foreach (T el in e) 
              yield return mapFunc(el);
      }
        ...

score 2 · Answer 8 · answered Nov 25 '08 at 15:46

I'm not sure about C#'s implementation of yield, but in dynamic languages it's far more efficient than creating the whole collection. In many cases, it makes it easy to work with datasets much bigger than RAM.

answered Nov 25 '08 at 15:46

Javier


score 1 · Answer 9 · answered Nov 25 '08 at 15:15

Personally, I haven't found myself using yield in my normal day-to-day programming. However, I've recently started playing with the Robotics Studio samples and found that yield is used extensively there, so I also see it being used in conjunction with the CCR (Concurrency and Coordination Runtime) where you have async and concurrency issues.

Anyway, still trying to get my head around it as well.

score 1 · Answer 10 · answered Nov 25 '08 at 15:19

Yield is useful because it saves you space. Most optimizations in programming make a trade-off between space (disk, memory, networking) and processing. Yield as a programming construct allows you to iterate over a collection many times in sequence without needing a separate copy of the collection for each iteration.

Consider this example:

static IEnumerable<Person> GetAllPeople()
{
    return new List<Person>()
    {
        new Person() { Name = "George", Surname = "Bush", City = "Washington" },
        new Person() { Name = "Abraham", Surname = "Lincoln", City = "Washington" },
        new Person() { Name = "Joe", Surname = "Average", City = "New York" }
    };
}

static IEnumerable<Person> GetPeopleFrom(this IEnumerable<Person> people,  string where)
{
    foreach (var person in people)
    {
        if (person.City == where) yield return person;
    }
    yield break;
}

static IEnumerable<Person> GetPeopleWithInitial(this IEnumerable<Person> people, string initial)
{
    foreach (var person in people)
    {
        if (person.Name.StartsWith(initial)) yield return person;
    }
    yield break;
}

static void Main(string[] args)
{
    var people = GetAllPeople();
    foreach (var p in people.GetPeopleFrom("Washington"))
    {
        // do something with washingtonites
    }

    foreach (var p in people.GetPeopleWithInitial("G"))
    {
        // do something with people with initial G
    }

    foreach (var p in people.GetPeopleWithInitial("P").GetPeopleFrom("New York"))
    {
        // etc
    }
}

(Obviously you are not required to use yield with extension methods, it just creates a powerful paradigm to think about data.)

As you can see, if you have a lot of these "filter" methods (but it can be any kind of method that does some work on a list of people) you can chain many of them together without requiring extra storage space for each step. This is one way of raising the programming language (C#) up to express your solutions better.

The first side-effect of yield is that it delays execution of the filtering logic until you actually require it. If you therefore create a variable of type IEnumerable<> (with yields) but never iterate through it, you never execute the logic or consume the space which is a powerful and free optimization.

The other side-effect is that yield operates on the lowest common collection interface (IEnumerable<>) which enables the creation of library-like code with wide applicability.

All of those are *really* just LINQ though. If you're using .NET 3.5, you'd surely implement GetPeopleWithInitial by returning people.Where(person => person.Name.StartsWith(initial)). — Jon Skeet, Nov 25 '08 at 15:26
well, yes and no. What you are saying is true, but you will have to person => person.Name.Startswith() everywhere. With a library method you get the obvious benefits... yield also comes in .NET 2 whereas not everybody has .NET 3.5 yet... — Pieter Breed, Nov 25 '08 at 15:30
Pieter: I'm not saying you should remove the library methods, but I'd normally implement them with LINQ. And when it's so close to being LINQ, it doesn't really feel like an answer to "when is yield useful outside LINQ" - reimplementing LINQ yourself doesn't count, IMO :) — Jon Skeet, Nov 25 '08 at 15:36
you don't need the yield break since its the last line of the method — Scott Cowan, Nov 25 '08 at 16:19

score 1 · Answer 11 · answered Nov 25 '08 at 17:21

Note that yield allows you to do things in a "lazy" way. By lazy, I mean that the evaluation of the next element in the IEnumerable is not done until the element is actually requested. This allows you the power to do a couple of different things. One is that you could yield an infinitely long list without the need to actually make infinite calculations. Second, you could return an enumeration of function applications. The functions would only be applied when you iterate through the list.
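As a small illustration of the "infinitely long list" idea, here is a sketch using Fibonacci numbers. The class and method names are made up for the example, and the caller decides how many values to pull.

```csharp
using System.Collections.Generic;

public static class InfiniteSketch
{
    // Never terminates on its own; each value is computed only when requested.
    // (The long values wrap around after roughly ninety elements, since
    // arithmetic is unchecked by default.)
    public static IEnumerable<long> Fibonacci()
    {
        long a = 0, b = 1;
        while (true)
        {
            yield return a;
            long next = a + b;
            a = b;
            b = next;
        }
    }

    // Pulls the first `count` values from any sequence, then stops.
    public static List<long> FirstN(IEnumerable<long> sequence, int count)
    {
        var result = new List<long>();
        if (count <= 0) return result;
        foreach (long value in sequence)
        {
            result.Add(value);
            if (result.Count == count) break;
        }
        return result;
    }
}
```

Asking for the first ten values returns 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, and the loop inside `Fibonacci` simply stops being resumed after that.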

score 0 · Answer 12 · answered Nov 25 '08 at 16:03

I've used yield in non-LINQ code for things like this (assuming the functions do not live in the same class):

public IEnumerable<string> GetData()
{
    foreach(String name in _someInternalDataCollection)
    {
        yield return name;
    }
}

...

public void DoSomething()
{
    foreach(String value in GetData())
    {
        //... Do something with value that doesn't modify _someInternalDataCollection
    }
}

You have to be careful not to inadvertently modify the collection that your GetData() function is iterating over though, or it will throw an exception.
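Here is a minimal sketch of that pitfall, assuming the internal collection is a `List<string>` (the answer does not say what type it is). `DataHolder` and `ModificationSketch` are hypothetical names. With a `List<T>`, the exception is an `InvalidOperationException`, raised when the iterator resumes after the list has changed.

```csharp
using System;
using System.Collections.Generic;

public class DataHolder
{
    // Assumed to be a List<string> for this sketch.
    private readonly List<string> _someInternalDataCollection =
        new List<string> { "a", "b", "c" };

    public IEnumerable<string> GetData()
    {
        foreach (string name in _someInternalDataCollection)
            yield return name;
    }

    public void Add(string name)
    {
        _someInternalDataCollection.Add(name);
    }
}

public static class ModificationSketch
{
    // Returns true if modifying the list during iteration raised an exception.
    public static bool ThrowsWhenModified()
    {
        var holder = new DataHolder();
        try
        {
            foreach (string value in holder.GetData())
                holder.Add(value + "!"); // changes the list mid-iteration
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

If the caller really does need to modify the collection while looping, one option is to have `GetData()` iterate over a copy, at the cost of the extra allocation that yield was avoiding in the first place.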

score 0 · Answer 13 · answered Dec 02 '08 at 23:32

Yield is very useful in general. It's in Ruby, among other languages that support functional-style programming, so it's not like it's tied to LINQ. It's more the other way around: LINQ is functional in style, so it uses yield.

I had a problem where my program was using a lot of CPU in some background tasks. What I really wanted was to still be able to write functions like normal, so that I could easily read them (i.e. the whole threading vs. event-based argument), and still be able to break the functions up if they took too much CPU. Yield is perfect for this. I wrote a blog post about this and the source is available for all to grok :)
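The blog post's code is not included here, so what follows is only a guess at the general idea, not the author's implementation. Each job is written as an ordinary-looking iterator method, and a tiny round-robin scheduler advances every job one step at a time. All names (`CooperativeSketch`, `SumTo`, `RunAll`) are invented for the sketch.

```csharp
using System.Collections.Generic;

public static class CooperativeSketch
{
    // A job written as a normal loop; each yield is a point where the
    // scheduler may switch to another job.
    public static IEnumerable<int> SumTo(int n, List<string> log, string name)
    {
        int total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += i;
            log.Add(name + ":" + i);
            yield return total;
        }
    }

    // Round-robin: advance each job by one step until all jobs are finished.
    public static void RunAll(IEnumerable<IEnumerator<int>> jobs)
    {
        var queue = new Queue<IEnumerator<int>>(jobs);
        while (queue.Count > 0)
        {
            IEnumerator<int> job = queue.Dequeue();
            if (job.MoveNext())
                queue.Enqueue(job);
            else
                job.Dispose();
        }
    }
}
```

Running a two-step job A and a three-step job B together interleaves them as A:1, B:1, A:2, B:2, B:3. A real scheduler could sleep or check a time budget between steps instead of switching immediately.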

score 0 · Answer 14 · answered Sep 28 '10 at 13:19

The System.Linq IEnumerable extensions are great, but sometimes you want more. For example, consider the following extension:

public static class CollectionSampling
{
    public static IEnumerable<T> Sample<T>(this IEnumerable<T> coll, int max)
    {
        var rand = new Random();
        // Dispose the enumerator when iteration ends.
        using (var enumerator = coll.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                yield return enumerator.Current;
                // Skip a random number of elements (0 to max - 1) before the next sample.
                int currentSample = rand.Next(max);
                for (int i = 1; i <= currentSample; i++)
                    enumerator.MoveNext();
            }
        }
    }
}

Another interesting advantage of yielding is that the caller cannot cast the return value to the original collection type and modify your internal collection.
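A quick sketch of the difference, using a hypothetical `Inventory` class: one property hands out the list itself typed as `IEnumerable<string>`, the other yields its items.

```csharp
using System.Collections.Generic;

public class Inventory
{
    private readonly List<string> _items = new List<string> { "x", "y" };

    // Returns the list itself, so a caller can cast it back to List<string>.
    public IEnumerable<string> Exposed
    {
        get { return _items; }
    }

    // Returns a compiler-generated iterator object, which is not a List<string>.
    public IEnumerable<string> Yielded
    {
        get
        {
            foreach (string item in _items)
                yield return item;
        }
    }

    public int Count
    {
        get { return _items.Count; }
    }
}
```

With `Exposed`, the expression `inventory.Exposed as List<string>` gives back the internal list, and the caller can add to or clear it. With `Yielded`, the same cast gives null, so the internal list stays under the class's control.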
