I have a sequence of items that I want to group by a key, computing several aggregates for each key.
The number of items is large, but the number of distinct keys is small.
A toy example:
    static List<(string Key, decimal Sum, int Count)> GroupStats(
        IEnumerable<(string Key, decimal Value)> items)
    {
        return items
            .GroupBy(x => x.Key)
            .Select(g => (
                Key: g.Key,
                Sum: g.Sum(x => x.Value),
                Count: g.Count()
            ))
            .ToList();
    }
Using LINQ's GroupBy has the unfortunate consequence that it needs to load all the items into memory before it can yield the first group.
An imperative implementation would only consume memory proportional to the number of distinct keys, but I'm wondering if there is a nicer solution.
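For reference, the imperative baseline mentioned above could look roughly like the sketch below. It is only an illustration under a few assumptions: the names GroupStatsSketch and GroupStatsImperative are made up for this example, keys are assumed to be non-null (Dictionary rejects null keys, whereas GroupBy accepts them), and the order of the results is not guaranteed to match the first-appearance order that GroupBy produces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupStatsSketch
{
    // Hypothetical helper: a single pass over items that keeps one
    // (Sum, Count) accumulator per distinct key instead of buffering items.
    public static List<(string Key, decimal Sum, int Count)> GroupStatsImperative(
        IEnumerable<(string Key, decimal Value)> items)
    {
        var stats = new Dictionary<string, (decimal Sum, int Count)>();

        foreach (var (key, value) in items)
        {
            // For a key not seen before, acc is the default tuple (0, 0).
            stats.TryGetValue(key, out var acc);
            stats[key] = (acc.Sum + value, acc.Count + 1);
        }

        // Materialize one result tuple per distinct key.
        return stats
            .Select(kv => (Key: kv.Key, Sum: kv.Value.Sum, Count: kv.Value.Count))
            .ToList();
    }
}
```

Memory use here grows with the number of distinct keys rather than the number of items, which is the property the GroupBy version lacks.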
Reactive Extensions' "push" approach should theoretically enable low-memory grouping as well, but I didn't find a way to escape from IObservable to materialize the actual values. I'm also open to other elegant solutions (besides the obvious imperative implementation).