1

I am looking for a better way of sorting the following type of data. The below works fine for smaller data sets (on some systems its 2000 on other 9000) but causes a stackoverflow when processing larger ones

The structure in place that holds the data looks like this

public class AttributeItem
{
    public string AttributeType { get; set; }
    public string Title { get; set; }
    public string Value { get; set; }
    public int ObjectID { get; set; }
    public bool CanModify { get; set; }
    public bool CanDelete { get; set; }
    public bool? IsParent { get; set; }
    public int SortID { get; set; }
    public int? ParentSortID { get; set; }
    public bool Deleted { get; set; }
}

public class AttributeItemNode
{
    public AttributeItem Item {get;set;}
    public int Depth {get;set;}

    public AttributeItemNode(AttributeItem item , int Depth)
    {
        this.Item = item ;
        this.Depth = Depth;
    }
}

Here is an example of the data needing to be sorted into a single object with an int indicating their depth. It is possible for the children levels to go deeper than the three level shown in the example data

var items = new List<AttributeItem>();
items.Add(new AttributeItem{Title ="Parent1", ObjectID=1,SortID =1, IsParent= true, ParentSortID = Int32.MinValue});

items.Add(new AttributeItem{Title ="FooChild", ObjectID=2,SortID =2, IsParent= false, ParentSortID = 1});

items.Add(new AttributeItem{Title ="Parent2", ObjectID=4,SortID =4, IsParent= true, ParentSortID = Int32.MinValue});

items.Add(new AttributeItem{ Title ="Parent2Child1", ObjectID=5,SortID =5, IsParent= false, ParentSortID = 4});

items.Add(new AttributeItem{Title ="Parent2Child2", ObjectID=7,SortID =7, IsParent= false, ParentSortID = 4});

items.Add(new AttributeItem{Title ="Parent2Child2Child1", ObjectID=6,SortID =6, IsParent= false, ParentSortID = 5});

The expected output would be as follows (I have removed the irrelevant data from the object to help readability)

Depth = 0 Title ="Parent1"
Depth = 1 Title ="FooChild" 
Depth = 0 Title ="Parent2"
Depth = 1 Title ="Parent2Child1" 
Depth = 2 Title ="Parent2Child2Child1"
Depth = 1 Title ="Parent2Child2"

Here is the actual sorting code

    public static IList<AttributeItemNode> SortAttributeItems(IList<AttributeItem> list)
    {
        List<AttributeItemNode> newList = new List<AttributeItemNode>();
        SortAttributeItems(list, null, 0, newList);

        return newList;
    }

    private static void SortAttributeItems(IList<AttributeItem> list, AttributeItem currentItem, int depth, List<AttributeItemNode> newList)
    {
        AttributeItem newItem = null;
        // look for children
        if (currentItem != null)
        {
            foreach (AttributeItem item in list)
            {
                if (item.ParentSortID.HasValue && item.ParentSortID.Value != Int32.MinValue && item.ParentSortID.Value == currentItem.SortID)
                {
                    newList.Add(new AttributeItemNode(item, (depth + 1)));
                    SortAttributeItems(list, item, depth + 1, newList); 
                }
            }
        }

        if (depth == 0)
        {
            foreach (AttributeItem item in list)
            {
                if (!item.ParentSortID.HasValue || item.ParentSortID.Value == Int32.MinValue) 
                {
                    if (currentItem == null || item.SortID >= currentItem.SortID) 
                    {
                        if (newItem == null || newItem.SortID >= item.SortID)
                        {
                            newItem = item;
                        }
                    }
                }
            }
        }

        if (newItem != null)
        {
            newList.Add(new AttributeItemNode(newItem, depth));
            list.Remove(newItem);
            SortAttributeItems(list, newItem, depth, newList);
        }

    }
Kevin Kunderman
  • 2,036
  • 1
  • 19
  • 30
  • I see your point but as stated in the question. It fails with larger amounts of data. So there is more then just looking for optimizations. If it just ran really slow with larger data sets then I would agree. – Kevin Kunderman Oct 01 '16 at 02:11
  • 1
    Then edit your title to ask for something other than *improvement*, which means *make something that works better*, and your first paragraph, which asks about making it *better*, which also says *it works, but I want to improve it*. – Ken White Oct 01 '16 at 02:12
  • Thanks. My close vote has been withdrawn. :-) – Ken White Oct 01 '16 at 02:33
  • As I see `depth` is also calculated, based on hierarchy / data available, that's not supplied by default – Mrinal Kamboj Oct 01 '16 at 02:39
  • @MrinalKamboj The default value for `depth` is 0 after that it is calculated `depth+1` as we move down the hierarchy – Kevin Kunderman Oct 01 '16 at 02:52
  • If you know the size of the result, I recommend using array or if possible in-place sort https://msdn.microsoft.com/en-us/library/w56d4y5z – Slai Oct 01 '16 at 03:01
  • Is your data guaranteed to represent a tree, or is it a more general directed acyclic graph? As written if you had a loop in the data what would happen? – Ian Mercer Oct 01 '16 at 03:01
  • 1. Use an optimized implementation such as the ones by Microsoft. 2. If you do have to implement your own, avoid recursion. – Lex Li Oct 01 '16 at 05:17
  • @KevinKunderman Check out the version I have posted which use the combination of Linq, Data structures and recursion to achieve the solution – Mrinal Kamboj Oct 01 '16 at 10:15
  • 1
    Please state (close to the beginning of the post) _why_ you roll your own sorting routine instead of using an existing one. – greybeard Oct 01 '16 at 11:40

3 Answers3

2

The problem can be solved efficiently w/o using recursion. It can be split on two parts - create a tree structure and flatten the tree using iterative pre-order Depth First Traversal, sorting each level.

For the first part we can use LINQ ToLookup method to create a fast lookup structure by ParentSortID in O(N) time.

For the second part, following the DRY principle I will use the general purpose method from my answer to How to flatten tree via LINQ? by creating an overload which allows projecting to a custom result from item and depth (which as you can see I already have):

public static class TreeUtils
{
    public static IEnumerable<TResult> Expand<T, TResult>(
        this IEnumerable<T> source, Func<T, IEnumerable<T>> elementSelector, Func<T, int, TResult> resultSelector)
    {
        var stack = new Stack<IEnumerator<T>>();
        var e = source.GetEnumerator();
        try
        {
            while (true)
            {
                while (e.MoveNext())
                {
                    var item = e.Current;
                    yield return resultSelector(item, stack.Count);
                    var elements = elementSelector(item);
                    if (elements == null) continue;
                    stack.Push(e);
                    e = elements.GetEnumerator();
                }
                if (stack.Count == 0) break;
                e.Dispose();
                e = stack.Pop();
            }
        }
        finally
        {
            e.Dispose();
            while (stack.Count != 0) stack.Pop().Dispose();
        }
    }
}

And here is the implementation of the method in question:

public static IList<AttributeItemNode> SortAttributeItems(IList<AttributeItem> list)
{
    var childrenMap = list.ToLookup(e => e.ParentSortID ?? int.MinValue);
    return childrenMap[int.MinValue].OrderBy(item => item.SortID)
        .Expand(parent => childrenMap[parent.SortID].OrderBy(item => item.SortID),
            (item, depth) => new AttributeItemNode(item, depth))
        .ToList();
}
Community
  • 1
  • 1
Ivan Stoev
  • 195,425
  • 15
  • 312
  • 343
  • A doubt, In `Expand` method call, where you are supplying `resultSelector` as follows `(item, depth) => new AttributeItemNode(item, depth)`, where does the depth value comes from, since that is part of ` – Mrinal Kamboj Oct 02 '16 at 02:12
  • @MrinalKamboj It comes from the `Expand` method (similar to `depth` variable in your recursive method). Since I'm using an explicit `stack` inside, the `stack.Count` is the current depth. – Ivan Stoev Oct 02 '16 at 03:16
  • This worked perfectly once I changed items.ToLookup(e => e.ParentSortID ?? int.MinValue); to list.ToLookup(e => e.ParentSortID ?? int.MinValue); – Kevin Kunderman Oct 03 '16 at 13:51
0

Any reason you can't simply follow the Parent pointers to calculate the depth?

If you put it in a Dictionary<int,AttributeItem> map with the SortId as the key you can now take each AttributeItem item and do:

int depth = 0;
var current = item;
while (!current.IsParent)
{ 
   depth++;
   current = map[current.ParentSortId;
}

If you used one of the many Nuget packages for trees or graphs you could do this and many other graph operations on your data including checking that it's valid and contains no cycles.

It's also a good idea not to have the same information represented two ways: you have IsParent but you also have a marker value on the ParentSortId. What if these don't agree? etc.

Ian Mercer
  • 38,490
  • 8
  • 97
  • 133
  • You can assign depth to the child by traversing to the `Parent`, but still recursion would be required to create the hierarchy as expected. In fact the solution I have provide use a similar Dictionary based strategy – Mrinal Kamboj Oct 01 '16 at 10:18
0
public class AttributeItemNode : IComparable<AttributeNode> {

    public int CompareTo(AttributeItemNode other) {
        // compare the Ids in appropriate order
    }
}

public class NodeCollection {
    protected List<AttributeItemNode> nodes;

    public void AddNode() { }

    public void Sort() { 
       nodes.Sort();
       this.CalcDepth();
    }

    protected void CalcDepth {
        foreach (var node in nodes)
          if (node.IsParent) { node.Depth = 0; break; }

          //use the various Ids that are now in sorted order
          // and calculate the item's Depth.
    }
}

An AttributeItem already has all it needs for sorting. Use IsParent (maybe?), SortId, and ParentSortId to implement CompareTo() above.

Calc Depth only after sorting, this avoids need for recursion.

Then:

myNodeCollection.Sort()

List.Sort() .NET intelligently decides which of several sort algorithms to use.

radarbob
  • 4,964
  • 2
  • 23
  • 36
  • This will surely not solve what OP has asked for, First part of linking Children to Parents in the `AttributeItem List` is not done and then OP has asked items to be listed in specific sub-hierarchies instead of generic sorted order, which is what your solution will achieve using `IComparable`, as it cannot do it for each sub-part – Mrinal Kamboj Oct 01 '16 at 10:10
  • The question is about avoiding recursion-caused stack overflow and calculating depth. In-keys-order is all that is needed. If some kind of tree or linked list is desired then just traverse the sorted list and build it. Even then recursion is avoided. This solution, then, has a chance of handling a large data volume. – radarbob Oct 01 '16 at 19:11
  • Please try to implement a solution based on your suggestion, you will understand where it would not work as expected, try on the data provided by the OP. The arrangement of `Parents` and `Children` by using a Sort using `IComparer`, is over simplifying the problem, it would not result in the expected Tree structure – Mrinal Kamboj Oct 02 '16 at 02:08