1

I have a fairly large collection of foo { int id, int parentid, string name}.

I am looking to collect a list of foo objects where the object has a name of "bar3", and is a child of an object named "bar2", which is a child of an object with an ID of 1.

What sort of collection should I use (I've been playing with lookups and dictionaries with not a whole lot of success) and how should I write this to make an efficient function out of this? There are approximately 30K foo objects and my method is choking to death.

Thanks!

Jeremy Holovacs
  • 22,480
  • 33
  • 117
  • 254
  • 1
    It sounds like you are trying to implement a tree data structure. What do you mean by more efficient? Whats choking? Are you running out of memory or are you trying to do a look-up. – Juan Ayala Aug 08 '12 at 13:17
  • Definitely a tree data structure, but the problem is it hangs for 3-5 minutes of each iteration when I try to run this, and since this runs several hundred times, it needs to be more efficient. I'm trying to use indices, but the concept of a compound index does not seem to exist in C# collections. – Jeremy Holovacs Aug 08 '12 at 13:22
  • If it's a tree data structure, why not create a tree data structure? Why have `parentID` when you can have `children`? – Jon Hanna Aug 08 '12 at 13:30
  • What kind of query are you using, what's the exact data structure? – AD.Net Aug 08 '12 at 13:30
  • The structure is not mutable; I am linking into an API with defined objects. It defines an id, a parentid, and a name, among other things. A parentid and a name are theoretically a unique key, as is the id. – Jeremy Holovacs Aug 08 '12 at 14:32

3 Answers3

3

If I really had to stick with this layout for foo, and I really had to make lookups as fast as possible (I don't care about memory size, and will be reusing the same objects repeatedly, so the cost of setting up a set of large structures in memory would be worth it), then I would do:

var byNameAndParentLookup = fooSource.ToLookup(f => Tuple.Create(f.parentid, f.name)); //will reuse this repeatedly
var results = byNameAndParentLookup[Tuple.Create(1, "bar2")].SelectMany(f => byNameAndParentLookup[Tuple.Create(f.id, "bar3")]);

That said, if I was going to store tree data in memory, I'd prefer to create a tree-structure, where each foo had a children collection (perhaps a dictionary keyed on name).

Edit: To explain a bit.

fooSource.ToLookup(f => Tuple.Create(f.parentid, f.name))

Goes through all the items in fooSource (wherever our foo objects are coming from), and for each one creates a tuple of the parentid and the name. This is used as a key for a lookup, so for each parentid-name combination we can retrieve 0 or more foo objects with that combo. (This will use the default string comparison, if you want something else such as case-insensitive, create an IEqualityComparer<Tuple<int, string>> implementation that does the comparison you want and use .ToLookup(f => Tuple.Create(f.parentid, f.name), new MyTupleComparer())).

The second line can be broken down into:

var partWayResults = byNameAndParentLookup[Tuple.Create(1, "bar2")];
var results = partWayResults.SelectMany(f => byNameAndParentLookup[Tuple.Create(f.id, "bar3")]);

The first line simply does a search on our lookup, so it returns an enumeration of those foo objects which have a parentid of 1 and a name of "bar2".

SelectMany takes each item of an enumeration or queryable, and computes an expression that returns an enumeration, which is then flattened into a single enumeration.

In other words, it works a bit like this:

public static SelectMany<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, IEnumerable<TResult>> func)
{
  foreach(TSource item in source)
    foreach(TResult producedItem in func(item))
      yield return producedItem;
}

In our case, the expression passed through takes the id of the element found in the first lookup, and then looks for any elements that have that as their parentid and have the name "bar2".

Hence, for every item with parentid 1 and name bar2, we find every item with that first item's id as its parentid and the name bar3. Which is what was wanted.

Jon Hanna
  • 110,372
  • 10
  • 146
  • 251
0

Check this out: QuickGraph I've never actually used it but it seems well documented. Alternatively you can try the C5 Generic Collection Library

I got this from this tread

Community
  • 1
  • 1
Juan Ayala
  • 3,388
  • 2
  • 19
  • 24
-1

I can suggest you to group all items by parentId first then apply conditions on it. First you will need to find group with bar1 element, than you should select all its childs and try to find element with name bar 2...

I can suggest such solution, it not the best but it work (thirdLevelElements will contain needed elements). I've used foreachs to make it clear, this logic could be written in linq statements but for me it will be complicated to understand.

var items = new[]
                            {
                                new Foo{id=1,parentid = 0, name="bar1"},
                                new Foo{id=2,parentid = 1, name="bar2"},
                                new Foo{id=3,parentid = 2, name="bar3"},
                                new Foo{id=4,parentid = 0, name="bar12"},
                                new Foo{id=5,parentid = 1, name="bar13"},
                                new Foo{id=6,parentid = 2, name="bar14"},
                                new Foo{id=7,parentid = 2, name="bar3"}
                            };

            var groups = items.GroupBy(item => item.parentid).ToList();
            var firstLevelElements = items.Where(item => item.name == "bar1");
            List<Foo> secondLevelElements = new List<Foo>();
            foreach (var firstLevelElement in firstLevelElements)
            {
                secondLevelElements.AddRange(groups[firstLevelElement.id]
                    .Where(item => item.name == "bar2"));
            }
            List<Foo> thirdLevelElements = new List<Foo>();
            foreach (var secondLevelElement in secondLevelElements)
            {
                thirdLevelElements.AddRange(groups[secondLevelElement.id]
                    .Where(item => item.name == "bar3"));
            }
user854301
  • 5,383
  • 3
  • 28
  • 37