3

Imagine an object with the properties:

        class TestObject
        {
            public string Name { get; set; }
            public Collection<TestObject> Children { get; set; }
        }

Now initialize some in a jagged fashion:

var person1 = new TestObject(){ 
                          Name = "Joe", 
                          Children = new Collection<TestObject>(){ childCollection1 }; 
                              };

var person2 = new TestObject(){ 
                          Name = "Mary", 
                          Children = new Collection<TestObject>(){ childCollection2 }; 
                              };

Where Joe's childCollection is only one level deep, but Mary's children have children, who also have children.

I have attempted to use SelectMany with no luck.

// Works
var joe = person1.Children.SelectMany(c => c.Children).Concat(person1.Children);

// Does not work - only returns 1 level deep
var mary = person2.Children.SelectMany(c => c.Children).Concat(person2.Children);

What is the best way to retrieve a result containing every child, to an unknown depth?

leppie
  • 115,091
  • 17
  • 196
  • 297
Neil.Allen
  • 1,606
  • 1
  • 15
  • 20

2 Answers2

5

Helper method

public static IEnumerable<T> Traversal<T>(
    T root,
    Func<T, IEnumerable<T>> getChildren)
{
    if (root == null)
    {
        yield break;
    }

    yield return root;

    var children = getChildren(root);
    if (children == null)
    {
        yield break;
    }

    foreach (var child in children)
    {
        foreach (var node in Traversal(child, getChildren))
        {
            yield return node;
        }
    }
}

//Or if you don't need all those null checks, here's a more compact version.
public static IEnumerable<T> Traversal<T>(
    T root,
    Func<T, IEnumerable<T>> getChildren)
{
    yield return root;
    foreach (var child in getChildren(root))
        foreach (var node in Traversal(child, getChildren))
            yield return node;
}

//If you like a LINQ/functional style better, this is also equivalent.
public static IEnumerable<T> Traversal<T>(
    T root,
    Func<T, IEnumerable<T>> getChildren)
{
    return new T[] { root }
        .Concat(getChildren(root)
            .SelectMany(child => Traversal(child, getChildren)));
}

Usage

var everybody = Traversal(person, x => x.Children);

Comments

You can easily modify the Traversal method to behave exactly the way you want. For example, if you only want leaf nodes, then you should only yield return root; when children is null or empty.

Performance concerns

If performance is any kind of issue, consider the LINQ/functional implementation above or take a look at Servy's answer, either of which should be more efficient than the version using yield ....

Timothy Shields
  • 75,459
  • 18
  • 120
  • 173
  • 2
    Recursive iterator blocks should really be avoided. Unlike traditional methods, they are quite a bit "heavier" and have a lot more associated state and construction/destruction associated with them. Writing a non-recursive traversal function instead will have fairly noticeable performance improvements. – Servy Aug 15 '13 at 20:13
  • @Servy Point well taken. However, the OP may not be at all interested in performance here. I would personally start with this implementation and only change it if I find it's too slow. – Timothy Shields Aug 15 '13 at 20:17
  • The difference is that this isn't designed to be a one time use function; it's designed to be something generic that you use in a lot of places. That's something that is a lot more likely to be worth optimizing. That the optimization actually makes the method simpler, easier to work with, and more flexible, in addition to being more performant, makes it all the easier of a decision to make. – Servy Aug 15 '13 at 20:19
  • @Servy The optimization doesn't make the method simpler. The reason your method might look simpler is because the `null` checks I'm doing aren't present in yours. Mine could be written in 4 lines if the `null` checks are removed. Take a look at the edited answer. – Timothy Shields Aug 15 '13 at 20:22
  • @Servy Also, the third implementation I have above (using functional LINQ) should be more efficient. – Timothy Shields Aug 15 '13 at 20:27
  • It would not be, for the exact same reasons your original method isn't. You're still constantly re-creating iterator objects that are the same as others, but with a few more items. Creating those objects aren't cheap (compared to the expense of the other operations being performed) so it can easily have a noticeable performance implication. – Servy Aug 15 '13 at 20:29
  • Thanks, and I love the LINQ implementation - doing some performance tests – Neil.Allen Aug 15 '13 at 21:33
5

You can write a generic traverse method like this:

public static IEnumerable<T> Traverse<T>(T root, 
    Func<T, IEnumerable<T>> childSelector)
{
    var stack = new Stack<T>();
    stack.Push(root);

    while (stack.Any())
    {
        var next = stack.Pop();
        yield return next;
        foreach (var child in childSelector(next))
            stack.Push(child);
    }
}

This is a general model that's useful for traversing trees in general. Note that this will do a depth first search. If you want a breath first search you would use a Queue<T> instead of a Stack<T>.

Servy
  • 202,030
  • 26
  • 332
  • 449
  • Very interesting using the Stack - working on some performance numbers to compare the two answers. Thanks. – Neil.Allen Aug 15 '13 at 21:34
  • 1
    In my performance tests, I got 10-13% better performance with this over the LINQ/Recursive methods. While performance was not part of the original question, it was worth noting. – Neil.Allen Aug 16 '13 at 01:58