Performance of nested yield in a tree

Question

I've got a tree-like structure. Each element in this structure should be able to return a Enumerable of all elements it is root to. Let's call this method IEnumerable<Foo> GetAll(). So if we have

     A  <-- topmost root
    / \ 
   /   \
  B     C
 / \   / \
D   E F   G

a call to GetAll on element C returns {C, F, G} (fixed order of elements would be nice, but is not needed). I guess everybody knew that already.

The current implementation of GetAll looks like this:

public IEnumerable<Foo> GetAll ()
{
    yield return this;

    foreach (Foo foo in MyChildren) {
        foreach (Foo f in foo.GetAll ()) {
            yield return f;
        }
    }
}

In an earlier implementation, I returned a List and added the child-foos using List.AddRange().

My question is if the version using yield is correcly implemented or if it should be improved (esp. in terms of performance). Or is this just bad and I should stick to Lists (or ReadOnlyCollections) instead?

score 21 · Answer 1 · answered Jun 25 '09 at 11:13

21

You can improve performance if you unroll recurse to stack, so you will have only one iterator:

public IEnumerable<Foo> GetAll()
{
    Stack<Foo> FooStack = new Stack<Foo>();
    FooStack.Push(this);

    while (FooStack.Count > 0)
    {
        Foo Result = FooStack.Pop();
        yield return Result;
        foreach (Foo NextFoo in Result.MyChildren)
            FooStack.Push(NextFoo);
    }
}

answered Jun 25 '09 at 11:13

arbiter

9,447
1
32
43

Nope, only one yielded iterator and one MyChildren iterator per node, while original solutions having one yielded iterator per node, and one MyChildren iterator per node, plus recursion. – arbiter Jun 25 '09 at 11:24
Thank you, I actually used this design in my code. I'm still marking Jon's answer as accepted since his link contain the same idea and he posted earlier. Hope you don't mind. o,o – mafu Jul 01 '09 at 10:47
Having had to do similar types of implementation workarounds, I've found that `List` ends up having a lot better performance than `Queue` and `Stack`. Also the ability to do `List.AddRange(...)` is a lot faster than a `foreach(...) { QueueEnqueue(...) }`. – mckamey Jul 20 '10 at 03:27
@mckamey were you using `.Count()` on the `Queue` implementation because that could also explain the difference with `List`? Likewise if the internal arrays have to keep growing that would also explain the performance difference. – Seph Apr 24 '14 at 09:03
One thing to note about this approach is that it does not return items in the same order as the original implementation. It returns children in reverse order since the last child `Push`ed is the first one `Pop`ed. Using `Queue` instead of `Stack` results in breadth-first traversal of the tree. – Jack A. Nov 14 '17 at 23:54

Jon Skeet · Accepted Answer · 2009-06-25T10:29:03.473

11

It's certainly not ideal in terms of performance - you end up creating a lot of iterators for large trees, instead of a single iterator which knows how to traverse efficiently.

Some blog entries concerning this:

Wes Dyer: All about iterators
Eric Lippert: Immutability in C#, part 6
Eric again: Immutability in C#, part 7

It's worth noting that F# has the equivalent of the proposed "yield foreach" with "yield!"

edited Jun 25 '09 at 10:29

answered Jun 25 '09 at 10:16

Jon Skeet

1,421,763
867
9,128
9,194

Bevan · Answer 3 · 2009-06-26T08:26:26.587

2

A better solution might be to create a visit method that recursively traverses the tree, and use that to collect items up.

Something like this (assuming a binary tree):

public class Node<T>
{
    public void Visit(Action<T> action)
    {
        action(this);
        left.Visit(action);
        right.Visit(action);
    }

    public IEnumerable<Foo> GetAll ()
    {
        var result = new List<T>();
        Visit( n => result.Add(n));
        return result;
    }
}

Taking this approach

Avoids creating large numbers of nested iterators
Avoids creating any more lists than necessary
Is relatively efficient
Falls down if you only need part of the list regularly

edited Jun 26 '09 at 08:26

answered Jun 25 '09 at 10:22

Bevan

43,618
10
81
133

It also takes up O(n) space instead of O(1) space - so it's efficient in terms of computation but not memory. – Jon Skeet Jun 25 '09 at 10:30
You should use a foreach instead of just left and right. He didn't specify it's a btree. – Maghis Jun 25 '09 at 11:09
Yes, any node contains 0+ children. But that's not too much of a difference anyway. – mafu Jun 25 '09 at 15:17

score 1 · Answer 4 · answered Jun 25 '09 at 10:13

1

No, that looks fine.

Have a look at my blog entry, it may be of some use :)

answered Jun 25 '09 at 10:13

leppie

115,091
17
196
297

score -2 · Answer 5 · answered Jun 25 '09 at 10:15

-2

According to my prior experiences using yield is a lot more effective than creating a List. If you're using .NET 3.5, this implementation should be fine. But don't forget the

yield break;

At the end. :-)

answered Jun 25 '09 at 10:15

ShdNx

3,172
5
40
47

Um, why would you want yield break at the end in this case? – Jon Skeet Jun 25 '09 at 10:16
2

Why do you need this at the end? I thought that the enumerator automatically finished when the Enumerable method exited ... – Bevan Jun 25 '09 at 10:18
Hmm, perhaps I misunderstood something regarding the use of yield. As I remember I got an error if I didn't close the method with yield break;. I'm sorry if I said something stupid! Gonna look into that matter... – ShdNx Jun 25 '09 at 10:20
Right, the IEnumator.MoveNext() automatically breaks if it gets out of scope. But it still shouldn't hurt using yield break, but you're right, it's not required. – ShdNx Jun 25 '09 at 10:25

Performance of nested yield in a tree

5 Answers5

Linked

Related