I'd like to compute a merged dictionary based on an input of List<IDictionary<string,object>>.

Typically this would be an explicit merged_dataset copy followed by a foreach:

var merged_dataset = my_datasets.SelectMany(dict => dict)
             .ToLookup(pair => pair.Key, pair => pair.Value)
             .ToDictionary(group => group.Key, group => group.ToList());
foreach (var item in merged_dataset)
  [...logic based on item.Value.Distinct()...]

Is there a way to do the above "on the fly", without the explicit call to ToDictionary that constructs merged_dataset, somewhat like Enumerable.Zip with two datasets, but with a List of N datasets as input?

In the end I would end up with:

var iterator = my_datasets.SelectMany(dict => dict)
         .GroupBy(dict => dict.Key)[...missing ToList...]
foreach ((string key, List<object> values) in iterator)

malat
  • Just replace `ToLookup` with `GroupBy` and remove `ToDictionary`? – Evk Sep 20 '21 at 12:40
  • Missing `ToList()` – malat Sep 20 '21 at 15:31
  • The question is still a bit unclear to me. "without explicit merged_dataset": you can just replace that variable with the expression itself in the foreach statement. – Evk Sep 20 '21 at 16:02
  • @Evk Let me know if the question is a bit less unclear. Thanks – malat Sep 21 '21 at 05:58
  • Are you willing to have a lot of overhead from looking up in the `List` for all the keys? What is your goal for avoiding `ToDictionary`? Also, why not put `Distinct` in `ToDictionary`, or just use the result of `ToLookup` directly? – NetMage Sep 21 '21 at 21:11
  • Apparently you don't realize that `GroupBy` does essentially the same thing as `ToLookup()` and has to pre-merge all the keys before it can return a single group, so I don't think that does what you want? – NetMage Sep 21 '21 at 21:34
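
The suggestion from the comments (GroupBy instead of ToLookup, plus the ToList the asker noted as missing) could look roughly like the sketch below. The sample data is hypothetical and only there to make the snippet runnable. As NetMage points out, GroupBy still has to read every pair before it can yield the first group, so this defers the work but does not stream it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample input; keys and values are illustrative only.
var my_datasets = new List<IDictionary<string, object>>
{
    new Dictionary<string, object> { ["a"] = 1, ["b"] = "x" },
    new Dictionary<string, object> { ["a"] = 1, ["c"] = 3 },
    new Dictionary<string, object> { ["a"] = 2, ["b"] = "x" },
};

// Deferred query: nothing is evaluated until the foreach below starts.
// GroupBy buffers all pairs internally on first enumeration, much like ToLookup.
var iterator = my_datasets
    .SelectMany(dict => dict)
    .GroupBy(pair => pair.Key, pair => pair.Value)
    .Select(group => (key: group.Key, values: group.ToList()));

foreach ((string key, List<object> values) in iterator)
{
    // Placeholder for the per-key logic; here the distinct values are printed.
    Console.WriteLine($"{key}: {string.Join(", ", values.Distinct())}");
}
```

No merged_dataset variable is needed, but the grouping work is the same as before; it just happens when the loop begins.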

1 Answer

You could create a class that implements a multi-dictionary, essentially presenting a List<Dictionary<TKey,TValue>> as a Dictionary<TKey,IEnumerable<TValue>>. Here is a sample implementation. Because it computes everything on demand, rescanning the list on each call, it is not very efficient for repeated operations. If the list is small and you are enumerating only once, it may be more efficient than pre-creating a new Lookup and enumerating that, but probably not.

using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class MultiDictionary<TKey,TValue> : IEnumerable<KeyValuePair<TKey, IEnumerable<TValue>>> {
    readonly List<Dictionary<TKey,TValue>> dictionaries;

    public MultiDictionary(List<Dictionary<TKey, TValue>> src) {
        dictionaries = src;
    }

    // Every member below is a deferred query that rescans the list on each use.
    public IEnumerable<TKey> Keys => dictionaries.SelectMany(d => d.Keys).Distinct();
    public IEnumerable<TValue> Values => dictionaries.SelectMany(d => d.Values);

    public bool ContainsKey(TKey key) => dictionaries.Any(d => d.ContainsKey(key));
    public bool ContainsValue(TValue value) => dictionaries.Any(d => d.ContainsValue(value));

    public IEnumerable<TValue> this[TKey key] => dictionaries.Where(d => d.ContainsKey(key)).Select(d => d[key]);
    public int Count => Keys.Count();

    public IEnumerator<KeyValuePair<TKey, IEnumerable<TValue>>> GetEnumerator()
        => Keys.Select(k => new KeyValuePair<TKey, IEnumerable<TValue>>(k, this[k])).GetEnumerator();

    // Explicit non-generic implementation forwards to the generic one above
    // (a public non-generic GetEnumerator calling itself would recurse forever).
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

One way to improve the performance would be to create a Lookup mapping all the keys to the dictionaries that contain them, but then you might as well create a merged Lookup that combines all the key value pairs.
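
For comparison, a minimal sketch of the merged Lookup mentioned above, used directly without ToDictionary. The sample dictionaries and variable names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample input; keys and values are illustrative only.
var dictionaries = new List<Dictionary<string, int>>
{
    new Dictionary<string, int> { ["a"] = 1, ["b"] = 2 },
    new Dictionary<string, int> { ["a"] = 3, ["c"] = 4 },
};

// ToLookup runs immediately: one pass over all key/value pairs.
ILookup<string, int> merged = dictionaries
    .SelectMany(d => d)
    .ToLookup(pair => pair.Key, pair => pair.Value);

// A lookup can be enumerated as a sequence of groups, one per key.
foreach (IGrouping<string, int> group in merged)
{
    Console.WriteLine($"{group.Key}: {string.Join(", ", group.Distinct())}");
}

// Indexing a key that is absent yields an empty sequence instead of throwing.
bool hasMissing = merged["missing"].Any();
```

The lookup pays the merge cost once, after which key access and repeated enumeration do not rescan the source list.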

NetMage