I made what I'm calling a TreePruner
. Its purpose: given a hierarchy starting at a list of root level nodes, return a new hierarchy where the new root nodes are the highest level nodes that meet a certain condition. Here is my class.
public class BreadthFirstPruner<TResource>
{
private IEnumerable<TResource> originalList;
private IEnumerable<TResource> prunedList;
private Func<TResource, ICollection<TResource>> getChildren;
public BreadthFirstPruner(IEnumerable<TResource> list, Func<TResource, ICollection<TResource>> getChildren)
{
this.originalList = list;
this.getChildren = getChildren;
}
public IEnumerable<TResource> GetPrunedTree(Func<TResource,bool> condition)
{
this.prunedList = new List<TResource>();
this.Prune(this.originalList, condition);
return this.prunedList;
}
private void Prune(IEnumerable<TResource> list, Func<TResource,bool> condition)
{
if (list.Count() == 0)
{
return;
}
var included = list.Where(condition);
this.prunedList = this.prunedList.Union(included);
var excluded = list.Except(included);
this.Prune(excluded.SelectMany(this.getChildren), condition);
}
}
The class does what it's supposed to, but it does so slowly, and I can't figure out why. I've used this on very small hierarchies where the complete hierarchy is already in memory (so there should be no linq-to-sql surprises). But regardless of how eager or lazy I try to make things, the first line of code to actually evaluate the results of a linq expression winds up taking 3-4 seconds to execute.
Here is the code that's currently consuming the pruner:
Func<BusinessUnitLabel, ICollection<BusinessUnitLabel>> getChildren = l => l.Children;
var hierarchy = scope.ToList();
var pruner = new BreadthFirstPruner<BusinessUnitLabel>(hierarchy, getChildren);
Func<BusinessUnitLabel, bool> hasBusinessUnitsForUser = l =>
l.BusinessUnits.SelectMany(bu => bu.Users.Select(u => u.IDGUID)).Contains(userId);
var labels = pruner.GetPrunedTree(hasBusinessUnitsForUser).ToList();
As I stated previously, the dataset that I'm working with when this executes is quite small. It's only a few levels deep with only one node on most levels. As it's currently written, the slowness will occur on the first recursive call to Prune
when I call list.Count()
, because that's when the second level of the hierarchy (excluded.SelectMany(this.getChildren)
) is being evaluated.
If, however, I add a .ToList
call like so:
var included = list.Where(condition).ToList()
Then the slowness will occur at that point.
What do I need to do to make this thing go fast?
Update
After someone prompted me to reevaluate my condition more carefully, I realized that those associations in hasBusinessUnitsForUser
were not being eager loaded. That there was the problem.