1

I have implemented the following hierarchy interface

interface ITree<T>
{
    // Incremential unique key
    int Id { get; set; }
    string Name { get; set; }
    // Hierarchy classical pattern
    T Parent { get; set; }
    ICollection<T> Children { get; set; }
    // Computed values
    int Depth { get; }
    // Hierarchy path with dotted notation (e.g.: 12.45.554.23,
    // where each path segment is an Id)
    string Path { get; set; }
}

class TreeItem : ITree<TreeItem>
{
    public int Id { get; set; }
    public string Name { get; set; }
    public TreeItem Parent { get; set; }
    public ICollection<TreeItem> Children { get; set; }
    public string Path { get; set; }    
    public int Depth { get { return Path.Split('.').Length - 1; } }
}

These items are stored and retrieved through Entity Framework, so we can assume that all the relation fields are not null and consistent :

  • Path and Depth are always up to date
  • Chaining works (e.g. : item.Parent?.Parent?.Parent)
  • Traversing the Children field is also working recursively.
  • Using Path and Depth is the preferred approach, since it does not need to compute on relation fields.

Consider I have the following hierarchy :

- A (Depth = 0)
-- B (Depth = 1)
-- C (Depth = 1)
- D  (Depth = 0)
-- E (Depth = 1)

All my elements are in a unordered flat array, let's say [D,C,B,E,A]. I want to use a Linq expression to sort them out the following way:

  • First level 0, according to its name field
  • All level 1 children of the previous, sorted by name
  • Second level 0 (still according to its name field)
  • All level 1 children of the previous...

The example is given for 2 levels of depth, but I would like the expression to traverse a hierarchy whatever its depth.

Note that the level and path fields of my data structure can be used to achieve this, since all the paths of the tree are rebuilt whenever an item is added, moved or removed, and the Depth field is computed with a simple Split('.') on the path.

Test sample :

var A = new TreeItem { Id = 1, Name = "A", Path = "1" };
var B = new TreeItem { Id = 2, Name = "B", Path = "1.2", Parent = A };
var C = new TreeItem { Id = 3, Name = "C", Path = "1.3", Parent = A };
var D = new TreeItem { Id = 4, Name = "D", Path = "4" };
var E = new TreeItem { Id = 5, Name = "E", Path = "4.5", Parent = D };

// populate children for the example.
// My actual code is automatic thanks to EF Inverse Relationship.
A.Children = new List<TreeItem> { B, C };
D.Children = new List<TreeItem> { E };

var listToSortHierarchically = new List<TreeItem> { D, C, B, E, A };
// I want the result of the hierarchical sort to be A B C D E
kall2sollies
  • 1,429
  • 2
  • 17
  • 33
  • What is `T`? Could you provide a sample implementation. And shouldn't it at least be constrained to be `where T : ITree` – Ivan Stoev Mar 10 '16 at 16:50
  • I added a sample implementation. Basically, T is a POCO object, and the Tree interface ensures it facades with Id, Name, Parent, Children, Depth and Path fields. – kall2sollies Mar 10 '16 at 17:20
  • Thank you. It's not compiling, but anyway. How about the constraint? I mean, if I have a `ITree node`, can I use `node.Parent.Parent` etc? W/o a constraint, I need to cast? – Ivan Stoev Mar 10 '16 at 17:23
  • Ok, quick edit to have a model that compiles, and answer your questions, but I will provide a code sample to setup the hierarchy. – kall2sollies Mar 10 '16 at 17:41

1 Answers1

2

Ok, first you should really add the following constraint

interface ITree<T>
    where T : class, ITree<T>
{
    // ...
}

so we can safely navigate the hierarchy using Parent and Children properties w/o casting.

Second, instead of reinventing the wheel, I'll reuse the general tree in order traversal helper method from my answer to How to flatten tree via LINQ? (and couple others):

public static partial class TreeUtils
{
    public static IEnumerable<T> Expand<T>(this IEnumerable<T> source, Func<T, IEnumerable<T>> elementSelector)
    {
        var stack = new Stack<IEnumerator<T>>();
        var e = source.GetEnumerator();
        try
        {
            while (true)
            {
                while (e.MoveNext())
                {
                    var item = e.Current;
                    yield return item;
                    var elements = elementSelector(item);
                    if (elements == null) continue;
                    stack.Push(e);
                    e = elements.GetEnumerator();
                }
                if (stack.Count == 0) break;
                e.Dispose();
                e = stack.Pop();
            }
        }
        finally
        {
            e.Dispose();
            while (stack.Count != 0) stack.Pop().Dispose();
        }
    }
}

With that helper in the pocket, the method in question is simple as that:

partial class TreeUtils
{
    public static IEnumerable<T> Ordered<T>(this IEnumerable<T> source, Func<IEnumerable<T>, IEnumerable<T>> order = null)
        where T : class, ITree<T>
    {
        if (order == null) order = items => items.OrderBy(item => item.Name);
        return order(source.Where(item => item.Parent == null))
            .Expand(item => item.Children != null && item.Children.Any() ? order(item.Children) : null);
    }
}

Sample usage:

List<TreeItem> flatList = ...;
var orderedList = flatList.Ordered().ToList();

UPDATE: Here is the same by using only Path and Id properties:

public static partial class TreeUtils
{
    public static IEnumerable<T> Ordered<T>(this IEnumerable<T> source, Func<IEnumerable<T>, IEnumerable<T>> order = null)
        where T : class, ITree<T>
    {
        if (order == null) order = items => items != null && items.Any() ?  items.OrderBy(item => item.Name) : null;
        var chldrenByParentId = source
            .Select(item => new { item, path = item.Path.Split('.') })
            .ToLookup(e => e.path.Length >= 2 ? int.Parse(e.path[e.path.Length - 2]) : (int?)null, e => e.item);
        return (order(chldrenByParentId[null]) ?? Enumerable.Empty<T>())
            .Expand(item => order(chldrenByParentId[item.Id]));
    }
}
Community
  • 1
  • 1
Ivan Stoev
  • 195,425
  • 15
  • 312
  • 343
  • Thanks for your precise answer... But actually, I was expecting an answer only involving the Path field (and the Depth, which comes from the Path). Your answer is of course brilliant and generic upon traversing and flattening or rebuilding trees, given the fact that we actually have Parent and Children field populated. But I would like to avoid using these, if possible, because for performance issues, I may eventually decide not to have them populated in my flat list to sort. Hope I am clear enough ! – kall2sollies Mar 10 '16 at 18:03
  • As you wish. It's doable, but IMO splitting and parsing the path and linear searching the flat list will be quite inefficient. – Ivan Stoev Mar 10 '16 at 18:12
  • Ok so I will investigate on how to setup your approach in my case. I must confess I have no theoretical knowledge on tree traversal and lookup algorithms, and I should improve that ! – kall2sollies Mar 10 '16 at 18:36
  • Anyway, for your convenience provided the alternative method not using `Parent`/`Children`. Enjoy :) – Ivan Stoev Mar 10 '16 at 18:49