How does using yield save time or memory?

Question

I'm new to C#, have not seen the equivalent of yield in previous languages I've tried to learn, and am not convinced that it is helpful except perhaps for readability. I survived all these years without it, so why do I need it?

As I understand it, you can use yield return to emit values of type T one by one rather than collecting those values into an IEnumerable<T> and returning the whole collection at the end. What's the point? After all, I'm sure there is some overhead involved in interrupting the execution of the function to copy out a single value. Perhaps I'll run some performance tests to see whether it's more efficient in terms of time. More than that, I'm wondering if you can show me a specific situation where I need to iterate through a set of values produced by a function and can only do it with yield, or would be better off doing it with yield.

Possible duplicate of [What is the yield keyword used for in C#?](http://stackoverflow.com/questions/39476/what-is-the-yield-keyword-used-for-in-c) — Turn, Dec 30 '15 at 06:50
It doesn't necessarily save either. It can reduce cognitive overhead for developers though - by simplifying the code they have to write. — Damien_The_Unbeliever, Dec 30 '15 at 07:47

Alexey · Answer 1 · 2015-12-30T08:49:54.537

The idea is to generate the values on the fly. Your collection of values might be infinite, or the cost of generating each value might be high. When you foreach over an IEnumerable, you are actually calling methods (MoveNext and Current) on an IEnumerator, which can be implemented in any way you like. The compiler automatically rewrites a function that uses yield into an IEnumerator that generates values only when they are requested. To generate values on the fly without yield, you would have to hand-code an IEnumerator implementation just like the one the compiler produces for a yielding function.
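To make the "values only when they are requested" point concrete, here is a minimal sketch that drives an iterator by hand, roughly the way foreach does. The Squares generator and the produced counter are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;

int produced = 0; // counts how many values the generator body actually computed

// Hypothetical infinite generator: squares of 1, 2, 3, ...
IEnumerable<long> Squares()
{
    for (long n = 1; ; n++)
    {
        produced++;
        yield return n * n;
    }
}

// Roughly what `foreach (var s in Squares()) { ... }` expands to.
var firstThree = new List<long>();
using (IEnumerator<long> e = Squares().GetEnumerator())
{
    // Each MoveNext call runs the generator body up to the next yield return.
    while (firstThree.Count < 3 && e.MoveNext())
    {
        firstThree.Add(e.Current);
    }
}

Console.WriteLine(string.Join(", ", firstThree)); // 1, 4, 9
Console.WriteLine(produced);                      // 3
```

Even though the sequence never ends, only the three requested values are computed, and no collection of squares is ever built inside the generator.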

Some specific situations where using a generator might be preferable to creating and returning a collection:

- Searching a very large file line by line. You don't want to load several gigabytes of text into memory, so it makes sense to read one line and yield return it. You can write a plain loop, of course, but by extracting the logic into a generator you can easily replace the file with a database table or a file in a different format, for example.
- Walking a tree. You can use a visitor to walk a tree, or you can use a generator to produce a sequence of nodes in the right order; the two approaches are inversions of one another. NB: recursive generators are a bad idea in C#!
- Generating infinite data for testing purposes, where each successive element uses previous elements to generate itself. ("On the 1298456th day of Christmas my true love sent me..." is a trivial example: you don't need to store 1298455 days' worth of presents, just the list of previous presents and the current day.)

Basically, in every case where you do not have to handle the IEnumerable as an ICollection, i.e. you treat it as a stream of values rather than as a finite bag of values with a Count, you might save time or memory by using a generator.
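The file-search case can be sketched as follows. This is only an illustration under stated assumptions: ReadLines and linesRead are hypothetical names, and a StringReader stands in for a large file so the example is self-contained (for a real file you would pass a StreamReader, or simply use the framework's File.ReadLines, which also enumerates lazily).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

int linesRead = 0; // how many lines were actually pulled from the reader

// Yields one line at a time; only the current line is held by the generator.
IEnumerable<string> ReadLines(TextReader reader)
{
    for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
    {
        linesRead++;
        yield return line;
    }
}

// Stand-in for a multi-gigabyte file.
var reader = new StringReader("alpha\nbeta\nneedle here\ngamma\ndelta");

// First stops enumerating as soon as a match is found.
string firstMatch = ReadLines(reader).First(l => l.Contains("needle"));

Console.WriteLine(firstMatch); // needle here
Console.WriteLine(linesRead);  // 3
```

The lines after the match are never read, and swapping the data source only means passing a different TextReader (or a different generator) to the same search code.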

I kindly suggest you to provide an *example* for (potentially) *infinite* length collection, e.g. a sequence `1, 2, 3, ...`, a collection with *high overhead* (which can be emulated with `Thread.Sleep`) makes your answer better as well. — Dmitry Bychenko, Dec 30 '15 at 07:16

Amit · Accepted Answer · 2015-12-31T08:41:32.233

As a marquee example of iterator usage, consider a number series iterator:

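// Infinite Fibonacci sequence: 0, 1, 1, 2, 3, 5, ...
// Note: int overflows after the 47th element (F(47) > int.MaxValue).
IEnumerable<int> fibo() {
  int cur = 0, next = 1;
  while(true) {
    yield return cur;
    // Advance the pair (cur, next) to (next, cur + next) without a temporary.
    next += cur;
    cur = next - cur;
  }
}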

Now we can choose what to do with the series, and only the required elements are calculated:

var fibs = fibo();
var sumOfFirst10Fibs = fibs.Take(10).Sum();
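As a further sketch of "choosing what to do with the series", the same generator can be combined with other LINQ operators without ever deciding up front how many elements to compute. The Fibo variant below is an assumption made for this example: it mirrors fibo() above but uses long to postpone overflow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same idea as fibo() above, using long and tuple assignment.
IEnumerable<long> Fibo()
{
    long cur = 0, next = 1;
    while (true)
    {
        yield return cur;
        (cur, next) = (next, cur + next);
    }
}

// Sum of the first ten elements: 0 + 1 + 1 + 2 + 3 + 5 + 8 + 13 + 21 + 34.
long sumOfFirst10 = Fibo().Take(10).Sum();

// Sum of the even elements below four million; TakeWhile ends the infinite sequence.
long evenSum = Fibo()
    .TakeWhile(f => f < 4_000_000)
    .Where(f => f % 2 == 0)
    .Sum();

Console.WriteLine(sumOfFirst10); // 88
Console.WriteLine(evenSum);      // 4613732
```

An eager version that returned a List<long> would have to be told the bound in advance; here the consumer decides when to stop.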

Another useful pattern is flattening a complex data structure, like a tree¹:

public class Tree<T> {
    public Tree<T> Left, Right;
    public T value;

    // Recursive in-order traversal: left subtree, this node, right subtree.
    public IEnumerable<T> InOrder() {
        if (Left != null) {
            foreach (T val in Left.InOrder())
                yield return val;
        }

        yield return value;

        if (Right != null) {
            foreach (T val in Right.InOrder())
                yield return val;
        }
    }
}

¹ As noted by Alexey in the comments, this recursive in-order traversal is inefficient (particularly when tall trees are traversed).
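One common way to avoid the nested-generator overhead is a single iterator with an explicit stack. The sketch below is not the answer's code: to stay self-contained it assumes a hypothetical array-backed tree (child indices, with -1 meaning "no child") instead of the Tree<T> class, but the same loop carries over to Tree<T> by replacing indices with node references and -1 with null.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical array-backed binary tree: node i has value values[i];
// left[i] and right[i] hold child indices, or -1 for "no child".
//        4
//       / \
//      2   6
//     / \ / \
//    1  3 5  7
int[] values = { 4, 2, 6, 1, 3, 5, 7 };
int[] left   = { 1, 3, 5, -1, -1, -1, -1 };
int[] right  = { 2, 4, 6, -1, -1, -1, -1 };

// Single, non-recursive iterator: every element passes through exactly one yield return.
IEnumerable<int> InOrder(int root)
{
    var pending = new Stack<int>();
    int node = root;
    while (node != -1 || pending.Count > 0)
    {
        // Descend as far left as possible, remembering the path.
        while (node != -1)
        {
            pending.Push(node);
            node = left[node];
        }
        node = pending.Pop();
        yield return values[node];
        node = right[node];
    }
}

string inOrder = string.Join(", ", InOrder(0));
Console.WriteLine(inOrder); // 1, 2, 3, 4, 5, 6, 7
```

The traversal is still lazy, but the cost per element no longer depends on how deep the node sits in the tree.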

edited Dec 31 '15 at 08:41

answered Dec 30 '15 at 07:41


Your example of flattening a tree is clear, but inefficient. Recursive generators are O(n^2), see http://blogs.msdn.com/b/wesdyer/archive/2007/03/23/all-about-iterators.aspx – Alexey Dec 30 '15 at 21:22
@Alexey - being clear was the intention, not being efficient. But I'm puzzled by your statement "*Recursive generators are O(n^2)*". What do you mean by that (couldn't see that mentioned in the link either)? – Amit Dec 30 '15 at 21:32
When you call a generator from another generator, you have two generators to traverse: two calls of MoveNext are necessary to get to the first element, two calls to get to the second, and one to stop. When you have three generators strung together, you need three calls to get to the first element (true, true, true), then three calls to get to the second (false, true, true), then two to get to the last (false, true), and one more to stop (false). With N generators calling each other, the number of calls is proportional to N*N. Trees are a bit different, but the result is similar. – Alexey Dec 31 '15 at 07:47
@Alexey - I understand you now. So your point is that each nested iterator adds a function call when the outer scope calls `MoveNext`. I'll edit the answer to include a relevant comment about this. Thanks! – Amit Dec 31 '15 at 08:38
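The quadratic growth described in these comments can be observed directly by counting how often a yield return executes. This is a rough sketch under assumptions: Nested and yieldsExecuted are hypothetical names, and the chain of generators mimics a recursive traversal of a degenerate (linked-list-shaped) tree rather than the exact three-generator scenario above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

long yieldsExecuted = 0; // total number of yield return statements executed

// Yields 1..depth; each level re-yields everything produced by the level below.
IEnumerable<int> Nested(int depth)
{
    if (depth == 0) yield break;

    foreach (int v in Nested(depth - 1))
    {
        yieldsExecuted++;
        yield return v;      // pass an inner element up one level
    }

    yieldsExecuted++;
    yield return depth;      // this level's own element
}

int count = Nested(100).Count();

Console.WriteLine(count);          // 100
Console.WriteLine(yieldsExecuted); // 5050, i.e. 100 * 101 / 2
```

Element k is yielded once where it originates and once more by each enclosing level, so n elements cost n(n+1)/2 yields in total.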

score 0 · Answer 3 · edited Jan 04 '16 at 08:38

MSDN covers this in detail:

When you use the yield keyword in a statement, you indicate that the method, operator, or get accessor in which it appears is an iterator. Using yield to define an iterator removes the need for an explicit extra class (the class that holds the state for an enumeration, see IEnumerator(Of T) for an example) when you implement the IEnumerable and IEnumerator pattern for a custom collection type.

Technical Implementation

The following code returns an IEnumerable<string> from an iterator method and then iterates through its elements.
IEnumerable<string> elements = MyIteratorMethod();
foreach (string element in elements)
{
   …
}
The call to MyIteratorMethod doesn't execute the body of the method. Instead the call returns an IEnumerable<string> into the elements variable.
On an iteration of the foreach loop, the MoveNext method is called for elements. This call executes the body of MyIteratorMethod until the next yield return statement is reached. The expression returned by the yield return statement determines not only the value of the element variable for consumption by the loop body but also the Current property of elements, which is an IEnumerable<string>.
On each subsequent iteration of the foreach loop, the execution of the iterator body continues from where it left off, again stopping when it reaches a yield return statement. The foreach loop completes when the end of the iterator method or a yield break statement is reached.
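The order of events described above can be checked with a small sketch. The body given to MyIteratorMethod here is an assumption made for illustration (the quoted documentation does not show one), and the log list is a hypothetical helper. Strictly speaking, foreach calls GetEnumerator on elements and then calls MoveNext and reads Current on the returned IEnumerator<string>.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();

// Hypothetical iterator body, instrumented to record when it runs.
IEnumerable<string> MyIteratorMethod()
{
    log.Add("body started");
    yield return "a";
    yield return "b";
    log.Add("body finished");
}

IEnumerable<string> elements = MyIteratorMethod();
log.Add("call returned"); // the body has not run yet at this point

foreach (string element in elements)
{
    log.Add("got " + element);
}

string trace = string.Join(" | ", log);
Console.WriteLine(trace);
// call returned | body started | got a | got b | body finished
```

The body starts only when the loop requests the first element, and it resumes after each yield return when the next element is requested.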

score 0 · Answer 4 · answered Dec 30 '15 at 07:24

yield can be useful in a scenario where the collection you want to return is not yet ready, i.e. you are building up the list while iterating. With yield return, you only need to have the next item available before returning it. Another case where yield return is preferable is when the IEnumerable represents an infinite set. Consider the list of prime numbers, or an infinite list of random numbers. You can never return the full IEnumerable at once, so you use yield return to return the list incrementally.
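A minimal sketch of the prime-number case, assuming simple trial division (the answer does not prescribe an algorithm; Primes and found are hypothetical names). The generator keeps only the primes found so far and hands out each new one as soon as it is known.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptually infinite sequence of primes (in practice bounded by the range of int).
IEnumerable<int> Primes()
{
    var found = new List<int>(); // primes discovered so far, used for trial division
    for (int n = 2; ; n++)
    {
        bool isPrime = true;
        foreach (int p in found)
        {
            if ((long)p * p > n) break;                      // no divisor up to sqrt(n)
            if (n % p == 0) { isPrime = false; break; }
        }
        if (isPrime)
        {
            found.Add(n);
            yield return n; // hand the prime to the caller before searching further
        }
    }
}

string firstTen = string.Join(", ", Primes().Take(10));
Console.WriteLine(firstTen); // 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
```

The caller decides how many primes to consume; the search for the next one does not start until it is asked for.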
