3
Dictionary<string, Dictionary<string, ... >...> nestedDictionary;

Above Dictionary has a one-to-many relationship at each level from top to bottom. Adding an item is pretty easy since we have the leaf object and we start from bottom, creating dictionaries and adding each to the relevant parent...

My problem is when I want to find an item at the inner Dictionaries. There are two options:

  1. Nested foreach and find the item then snapshot all the loops at the moment we found the item and exit all loops. Then we know item pedigree is string1->string2->...->stringN. Problems with this solution is A) Performance B) Thread-safety (since I want to remove the item, the parent if it has no child and it's parent if it has no child...)
  2. Creating a reverse look-up dictionary and indexing added items. Something like a Tuple for all outer dictionaries. Then add the item as key and all the outer parents as Tuple members. Problem: A) Redundancy B) Keeping synchronized reverse look-up Dictionary with main Dictionary.

Any idea for a fast and thread-safe solution?

Xaqron
  • 29,931
  • 42
  • 140
  • 205
  • You mention Redundancy as a problem for the 2nd approach. Do you mean that in the sense of wasted memory? – hawk May 21 '11 at 03:46
  • @hawk: In second option, each item should hold data of all it's parents to the root. Consider nodeR as the root with childA which has 2 items (itemA & itemB). Now we should record itemA=>childA=>nodeR and itemB=>childA=>nodeR. Extend this example for more complicated scenarios. This is flattering a multi-dimensional array and the cost is in memory as you mentioned (also design doesn't make sense). – Xaqron May 21 '11 at 13:51
  • if memory is what you're worried about, you should consider string interning since it sounds like all the items you're talking about here are strings. we've used string interning to reduce memory usage, with quite some success in some of our applications. – hawk May 22 '11 at 07:46

2 Answers2

1

It looks like you actually have more than two levels of Dictionary. Since you cannot support a variable number of dictionaries using this type syntax:

 Dictionary<string, Dictionary<string, ... >...> nestedDictionary;

I can only assume that it is some number greater than two. Let's say that it's three. For any data structure you construct, you have an intended use and operations that you want to perform efficiently.

I'm going to assume you need calls like this:

var dictionary = new ThreeLevelDictionary();
dictionary.Add(string1, string2, string3, value);
var value = dictionary[string1, string2, string3];
dictionary.Remove(string1, string2, string3);

And (critical to the question) the reverse lookup you are describing:

var strings = dictionary.FindKeys(value);

If these are the operations that you need to perform and to perform quickly, then one data structure that you can use is a Dictionary with a Tuple key:

public class ThreeLevelDictionary<TValue> : Dictionary<Tuple<string, string, string>, TValue>
{
    public void Add(string s1, string s2, string s3, TValue value)
    {
        Add(Tuple.Create(s1, s2, s3), value);
    }

    public TValue this[string s1, string s2, string s3]
    {
        get { return this[Tuple.Create(s1, s2, s3)]; }
        set { value = this[Tuple.Create(s1, s2, s3)]; }
    }

    public void Remove(string s1, string s2, string s3)
    {
        Remove(Tuple.Create(s1, s2, s3);
    }

    public IEnumerable<string> FindKeys(TValue value)
    {
        foreach (var key in Keys)
        {
            if (EqualityComparer<TValue>.Default.Equals(this[key], value))
                return new string[] { key.Item1, key.Item2, key.Item3 };
        }
        throw new InvalidOperationException("missing value");
    }
}

Now you are perfectly positioned to create a reverse-lookup hashtable using another Dictionary if performance indicates that this is a bottleneck.

If the previous liked operations are not the ones you want to perform, then this data structure might not meet your needs. Either way, if you describe the interface first that summarizes what you want the data structure to do, then it's easier to see if there are other alternatives.

Rick Sladkey
  • 33,988
  • 6
  • 71
  • 95
  • You have flattered `one-to-many` relations into `one-to-one` at all levels except the inner one. This way there's no constrains for none of `Tuple` members to be unique. Obviously other operations like cascading are not supported and many more... – Xaqron May 21 '11 at 03:07
  • No, not at all. The dictionary key is now a multi-key. All possible combination of strings s1-s2-s3 are candidate keys. This has exactly the same relationship as the nested dictionaries. Put it this way: if you were going to store you data structure in a database, how would you do it? – Rick Sladkey May 21 '11 at 04:17
  • Then how can we get all children of `s2` for example? – Xaqron May 21 '11 at 14:27
1

Although I have little direct experience with the C5 collection library, it sounds like you could use their TreeDictionary class. It comes with a whole suite of useful methods for finding, iterating and modifying the tree, and is surprisingly well documented.

Another option would be to use the QuickGraph library (which you can find in NuGet or on codeplex). This involves some knowledge of graph theory but is otherwise a very useful library.

Both libraries require you to handle concurrency, just like the standard BCL collections.

Morten Mertner
  • 9,414
  • 4
  • 39
  • 56