1

Given an enumeration of records in the format:

Name (string)
Amount (number)

For example:

Laverne   4
Lenny     2
Shirley   3
Squiggy   5

I want to group the records so that each group's total Amount does not exceed some per-group limit. For example, with a limit of 10:

Group 1 (Laverne,Lenny,Shirley) with Total Amount 9
Group 2 (Squiggy) with Total Amount 5

The Amount number is guaranteed to always be less than the grouping limit.

Craig Celeste

4 Answers

2

Here I have a solution using only LINQ functions:

// Record definition
class Record
{
    public string Name;
    public int Amount;
    public Record(string name, int amount)
    {
        Name = name;
        Amount = amount;
    }
}

// actual code for setup and LINQ
List<Record> records = new List<Record>()
{
    new Record("Laverne", 4),
    new Record("Lenny", 2),
    new Record("Shirley", 3),
    new Record("Squiggy", 5)
};
int groupLimit = 10;

// the solution
List<Record[]> test = 
    records.GroupBy(record => records.TakeWhile(r => r != record)
                                     .Concat(new[] { record })
                                     .Sum(r => r.Amount) / (groupLimit + 1))
           .Select(g => g.ToArray()).ToList();

For the sample data, this gives the expected result:

test = 
{
    { [ "Laverne", 4 ], [ "Lenny", 2 ], [ "Shirley", 3 ] },
    { [ "Squiggy", 5 ] }
}

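One downside is that this is O(n²), because the running total is recomputed for every record. It essentially groups by the index of the group, computed from the sum of the records up to and including the current one. Note that groupLimit + 1 is needed so that running totals from 0 to groupLimit, inclusive, all map to the first group. Be aware that the index comes from the overall running total rather than from a total that restarts with each group, so a group can still exceed the limit on other inputs: amounts 6, 6, 6 with a limit of 10 give indexes 0, 1, 1, and the second group totals 12.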

I'm trying to find a way of making it prettier, but it doesn't look easy.
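
To see what the grouping key evaluates to, here is a minimal sketch that computes the same key (running total divided by groupLimit + 1, using integer division) for a list of amounts. The class and method names are hypothetical and not part of the answer, and the 6, 6, 6 input is an assumed test case chosen to probe the behaviour beyond the sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixBucketDemo
{
    // Hypothetical helper: returns the group index each amount would get,
    // i.e. (running total so far, including this amount) / (limit + 1).
    // The running total is carried along, so this pass is O(n).
    public static List<int> BucketIndexes(IEnumerable<int> amounts, int limit)
    {
        int runningTotal = 0;
        var indexes = new List<int>();
        foreach (int amount in amounts)
        {
            runningTotal += amount;
            indexes.Add(runningTotal / (limit + 1));
        }
        return indexes;
    }

    public static void Main()
    {
        // Sample data: running totals 4, 6, 9, 14 give indexes 0, 0, 0, 1.
        Console.WriteLine(string.Join(", ", BucketIndexes(new[] { 4, 2, 3, 5 }, 10)));

        // Assumed input: running totals 6, 12, 18 give indexes 0, 1, 1,
        // so the second group would total 12 and exceed the limit of 10.
        Console.WriteLine(string.Join(", ", BucketIndexes(new[] { 6, 6, 6 }, 10)));
    }
}
```

If groups must never exceed the limit for arbitrary input, the running total probably needs to restart whenever a new group begins, which is what the stateful approaches in the other answers do.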

Jashaszun
  • +1 This was helpful. I marked Jon's answer as accepted because it is O(n) and suits my needs since my data is already in memory as an array. Cheers! – Craig Celeste Aug 20 '15 at 14:42
2

If you allow for captured variables to maintain state, then it becomes easier. If we have:

int limit = 10;

Then:

int groupTotal = 0;
int groupNum = 0;
var grouped = records.Select(r =>
{
    int newCount = groupTotal + r.Amount;
    if (newCount > limit)
    {
        groupNum++;
        groupTotal = r.Amount;
    }
    else
        groupTotal = newCount;
    return new { Records = r, Group = groupNum };
}).GroupBy(g => g.Group, g => g.Records);

It's O(n) and needs just a Select and a GroupBy, but relying on captured variables may not be portable across query providers.

For LINQ to Objects, though, it's fine.
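
One thing to watch with captured state is deferred execution: the query runs again each time it is enumerated, and the captured counters are not reset in between, so a second enumeration can produce different groups. A minimal sketch of one way around this, assuming it is acceptable to materialize the result once, is to keep the counters local to a method and call ToList. The class name, method name, and tuple-based record shape below are hypothetical stand-ins for the Record class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CapturedStateDemo
{
    // Hypothetical wrapper: the counters are locals, so every call starts from zero,
    // and ToList() runs the query exactly once.
    public static List<List<string>> Group((string Name, int Amount)[] records, int limit)
    {
        int groupTotal = 0;
        int groupNum = 0;
        return records
            .Select(r =>
            {
                if (groupTotal + r.Amount > limit)
                {
                    groupNum++;              // this record opens the next group
                    groupTotal = r.Amount;
                }
                else
                {
                    groupTotal += r.Amount;  // this record still fits in the current group
                }
                return (Record: r, Group: groupNum);
            })
            .GroupBy(x => x.Group, x => x.Record.Name)
            .Select(g => g.ToList())
            .ToList();
    }

    public static void Main()
    {
        (string Name, int Amount)[] records =
        {
            ("Laverne", 4), ("Lenny", 2), ("Shirley", 3), ("Squiggy", 5)
        };
        foreach (var group in Group(records, 10))
        {
            Console.WriteLine(string.Join(", ", group));
        }
        // prints "Laverne, Lenny, Shirley" and then "Squiggy"
    }
}
```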

Jon Hanna
0

A dotnet fiddle with a solution using Aggregate:

https://dotnetfiddle.net/gVgONH

using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    // Record definition
    public class Record
    {
        public string Name;
        public int Amount;
        public Record(string name, int amount)
        {
            Name = name;
            Amount = amount;
        }
    }

    public static void Main()
    {
        // actual code for setup and LINQ
        List<Record> records = new List<Record>()
        {
            new Record("Alice", 1), new Record("Bob", 5), new Record("Charly", 4),
            new Record("Laverne", 4), new Record("Lenny", 2), new Record("Shirley", 3),
            new Record("Squiggy", 5)
        };
        int groupLimit = 10;
        int sum = 0;
        var result = records.Aggregate(new List<List<Record>>(), (accumulated, next) =>
        {
            // Start a new team for the very first record, or when adding this record
            // would exceed the limit (a total equal to the limit is still allowed).
            if ((sum + next.Amount > groupLimit) || accumulated.Count == 0)
            {
                Console.WriteLine("New team: " + accumulated.Count);
                accumulated.Add(new List<Record>());
                sum = 0;
            }

            sum += next.Amount;
            Console.WriteLine("New member {0} ({1}): adds up to {2}", next.Name, next.Amount, sum);
            accumulated.Last().Add(next);
            return accumulated;
        });
        Console.WriteLine("Team count: " + result.Count);
    }
}

With output:

New team: 0
New member Alice (1): adds up to 1
New member Bob (5): adds up to 6
New member Charly (4): adds up to 10
New team: 1
New member Laverne (4): adds up to 4
New member Lenny (2): adds up to 6
New member Shirley (3): adds up to 9
New team: 2
New member Squiggy (5): adds up to 5
Team count: 3
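
As a variation on the Aggregate idea, the running sum can live inside the accumulator instead of a variable captured from outside the lambda. This is only a sketch under assumptions: it works on plain int amounts rather than Record objects, the class and method names are hypothetical, and it treats a total equal to the limit as allowed, as the question describes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateSeedDemo
{
    // Hypothetical helper: the seed tuple carries both the groups built so far
    // and the running sum of the group currently being filled.
    public static List<List<int>> Group(IEnumerable<int> amounts, int limit)
    {
        var seed = (Groups: new List<List<int>>(), Sum: 0);
        return amounts.Aggregate(seed, (state, amount) =>
        {
            // Open a new group for the first amount or when the limit would be exceeded.
            if (state.Groups.Count == 0 || state.Sum + amount > limit)
            {
                state.Groups.Add(new List<int>());
                state.Sum = 0;
            }
            state.Groups[state.Groups.Count - 1].Add(amount);
            state.Sum += amount;
            return state;
        }).Groups;
    }

    public static void Main()
    {
        var groups = Group(new[] { 1, 5, 4, 4, 2, 3, 5 }, 10);
        foreach (var group in groups)
        {
            Console.WriteLine(string.Join(" + ", group) + " = " + group.Sum());
        }
        // prints "1 + 5 + 4 = 10", "4 + 2 + 3 = 9", "5 = 5"
    }
}
```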
Pieter21
-1

There is no 'performant' way to do this with the built-in LINQ operators that I am aware of. You could create your own extension method, though:

public static class EnumerableExtensions
{
    public static IEnumerable<TResult> GroupWhile<TSource, TAccumulation, TResult>(
        this IEnumerable<TSource> source,
        Func<TAccumulation> seedFactory,
        Func<TAccumulation, TSource, TAccumulation> accumulator,
        Func<TAccumulation, bool> predicate,
        Func<TAccumulation, IEnumerable<TSource>, TResult> selector)
    {
        TAccumulation accumulation = seedFactory();
        List<TSource> result = new List<TSource>();
        using(IEnumerator<TSource> enumerator = source.GetEnumerator())
        {
            while(enumerator.MoveNext())
            {
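                // Close the current group when adding this element would break the predicate.
                // The Count check avoids yielding an empty group if the very first element fails it.
                if(result.Count > 0 && !predicate(accumulator(accumulation, enumerator.Current)))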
                {
                    yield return selector(accumulation, result);
                    accumulation = seedFactory();
                    result = new List<TSource>();
                }
                result.Add(enumerator.Current);
                accumulation = accumulator(accumulation, enumerator.Current); 
            }

            if(result.Count > 0)
            {
                yield return selector(accumulation, result);
            }
        }
    }
}

And then call it like this:

int limit = 10;
var groups =
    records
    .GroupWhile(
        () => 0,
        (a, x) => a + x.Amount,
        (a) => a <= limit,
        (a, g) => new { Total = a, Group = g });

As written, a single record that exceeds the limit is returned in a group by itself. You could modify the method to exclude such records, or leave it as is and filter them out with Where.

This solution has O(n) runtime.
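
To illustrate the oversized-record behaviour and the Where-based exclusion, here is a minimal sketch. It does not use GroupWhile itself: GroupByTotal is a hypothetical, simplified stand-in specialised to int amounts, and the input containing 12 is an assumed test case (the question rules such records out).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunningTotalDemo
{
    // Hypothetical stand-in for GroupWhile: yields (Total, Items) pairs in one pass.
    public static IEnumerable<(int Total, List<int> Items)> GroupByTotal(
        IEnumerable<int> amounts, int limit)
    {
        int total = 0;
        var items = new List<int>();
        foreach (int amount in amounts)
        {
            // Close the current group once the next amount would push it over the limit.
            if (items.Count > 0 && total + amount > limit)
            {
                yield return (total, items);
                total = 0;
                items = new List<int>();
            }
            items.Add(amount);
            total += amount;
        }
        if (items.Count > 0)
        {
            yield return (total, items);
        }
    }

    public static void Main()
    {
        int limit = 10;
        // 12 is larger than the limit, so it ends up in a group of its own.
        var groups = GroupByTotal(new[] { 4, 2, 12, 3, 5 }, limit).ToList();
        // Drop any group whose total exceeds the limit.
        var withinLimit = groups.Where(g => g.Total <= limit).ToList();

        Console.WriteLine(string.Join(", ", groups.Select(g => g.Total)));      // prints "6, 12, 8"
        Console.WriteLine(string.Join(", ", withinLimit.Select(g => g.Total))); // prints "6, 8"
    }
}
```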

Jason Boyd
  • Didn't downvote, but part of your answer is wrong (`There is no way to do this with the built in Linq operators`) and the rest seems much too verbose. Take a look at mine. It's concise and uses LINQ, as the asker wanted. – Jashaszun Aug 19 '15 at 23:04
  • @Jashaszun - I have modified it to say 'no performant way'. I stand by that. If anybody can find an O(n) solution using just the standard LINQ operators then I salute them. I do not know what aspect you are referring to when you say it is verbose. If you are talking about the extension method, that is a one-time cost and one worth paying, IMHO, if it means I get O(n) performance when I call it in the future. If you are talking about the method signature, well, you have me there, but aggregate functions tend to be verbose by nature. I am not sure anything can be done about that. – Jason Boyd Aug 20 '15 at 00:01