3

I have an IEnumerable<IEnumerable<CustomObject>>, where CustomObject has an x (used as a key; in this case 1, 2, 3, 4) and a y value. Some pretend data:

{
  { {1, 2}, {2, 4}, {3, 6}, {4, 8} }, 
  { {1, 2}, {2, 0}, {3, 0}, {4,-2} },
  { {1, 2}, {2, 2}, {3, 0}, {4, 0} }
}

What is the best way I can retrieve the following IEnumerable<CustomObject>:

{ {1, 2}, {2, 2}, {3, 2}, {4, 2} }

I.e. the average of the y values for each element.

Performance needs to be reasonable, so no .ToList() or similar can be used. I've been trying various things with LINQ but to no avail.

Update

@Bort, @Rawling, I've tested your answers and @Rawling's is very slightly faster. @Bort's answer, however, is more readable so I think I will go with that for the moment. Please feel free to keep answers coming!

dav_i

4 Answers

5

Using LINQ, you can flatten out the list of lists with SelectMany, then GroupBy x and Select the average:

var averages = customObjectLists
    .SelectMany(l => l)
    .GroupBy(co => co.x)
    .Select(g => new CustomObject { x = g.Key, y = g.Average(co => co.y) });
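
To make the query above concrete, here is a minimal self-contained sketch that runs it against the sample data from the question. The CustomObject class, the Row helper, and the AverageByKey wrapper are hypothetical stand-ins that are not part of the original post, and y is assumed to be a double so that the result of Average can be assigned without a cast (if y is an int, a cast such as (int) would be needed).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's CustomObject.
// Assumption: y is a double; the real type is not shown in the question.
public class CustomObject
{
    public int x;
    public double y;
}

public static class AverageByKeyDemo
{
    // Hypothetical helper: builds one inner list with keys 1, 2, 3, ...
    public static CustomObject[] Row(params int[] ys) =>
        ys.Select((y, i) => new CustomObject { x = i + 1, y = y }).ToArray();

    // The query from the answer, wrapped in a method for reuse.
    public static IEnumerable<CustomObject> AverageByKey(
        IEnumerable<IEnumerable<CustomObject>> customObjectLists)
    {
        return customObjectLists
            .SelectMany(l => l)            // flatten the list of lists
            .GroupBy(co => co.x)           // group by key
            .Select(g => new CustomObject { x = g.Key, y = g.Average(co => co.y) });
    }

    public static void Main()
    {
        var data = new[]
        {
            Row(2, 4, 6, 8),
            Row(2, 0, 0, -2),
            Row(2, 2, 0, 0)
        };

        // Every key in the sample data averages to 2.
        foreach (var avg in AverageByKey(data))
        {
            Console.WriteLine("{0}: {1}", avg.x, avg.y);
        }
    }
}
```

The groups come out in the order in which each key is first seen, so for the sample data the keys appear as 1, 2, 3, 4.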
Bort
  • Does `GroupBy` not count as calling `ToList()` - it does solidify the whole data set. – Rawling Aug 09 '12 at 13:36
  • If you are calculating averages you **need** to iterate through the whole dataset - else you cannot calculate the average. – Maarten Aug 09 '12 at 13:59
  • Yes you need to iterate through it all, but it doesn't mean you need to load it all into memory at once. – Rawling Aug 09 '12 at 14:11
  • Where does it say that GroupBy loads the whole dataset? – Maarten Aug 09 '12 at 17:40
  • @Maarten: It does. Use reflector and you'll see it putting each element into an array. Use this code on infinite sequences and you'll get an `OutOfMemoryException` before you get any results. Think about it and you'll see you can't get an average across a group unless you know there aren't any more elements in the input that go in that group - and you can't know that until the input is exhausted, and you need to have stored all the elements from other groups somewhere. – Rawling Aug 10 '12 at 07:16
  • @Maarten (I'd like to retract that last "think about it" point, obviously from your first comment you know that.) – Rawling Aug 10 '12 at 07:35
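
The buffering behaviour discussed in these comments can be checked with a small experiment. The sketch below is not from the thread: the Source iterator and the Pulled counter are hypothetical names used only to count how many items `GroupBy` reads from its input before it hands back the first group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByBufferingDemo
{
    // Hypothetical counter: number of items pulled from Source so far.
    public static int Pulled;

    // Hypothetical source: yields 1..count and records each item as it is pulled.
    public static IEnumerable<int> Source(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    public static void Main()
    {
        Pulled = 0;
        var groups = Source(6).GroupBy(n => n % 2);

        // Building the query is deferred: nothing has been read yet.
        Console.WriteLine(Pulled); // 0

        // Asking for just the first group reads the entire source.
        var first = groups.First();
        Console.WriteLine(Pulled); // 6
    }
}
```

So `GroupBy` is deferred like other LINQ operators, but once enumeration starts it consumes the whole input before yielding any group.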
1

Something like this should get you the results you're looking for. It flattens the list of lists into a single sequence of CustomObject, groups by the X value, and averages the Y values, leaving you with an IEnumerable of an anonymous type with X and Y properties. To get an IEnumerable<CustomObject> instead, change the select new { ... } to construct a CustomObject.

var myComplexObject = ...; // your IEnumerable<IEnumerable<CustomObject>>
var result = from innerList in myComplexObject
             from item in innerList
             group item by item.X into grp
             select new { X = grp.Key, Y = (int)grp.Average(p => p.Y) };
goric
1

If you don't mind solidifying the outer enumerator, the following LINQy method will defer execution of the inner enumerators.

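IEnumerable<V> AggregateAcross<T, U, V>(
    IEnumerable<IEnumerable<T>> input,
    Func<T, U> select,
    Func<IEnumerable<U>, V> aggregate)
{
    // Solidify the outer sequence: one enumerator per inner sequence.
    var enumerators = input.Select(ie => ie.GetEnumerator()).ToArray();

    // Advance all inner enumerators in lockstep; stop when any one runs out.
    while (enumerators.All(e => e.MoveNext()))
    {
        // Aggregate the current element of every inner sequence.
        yield return aggregate(enumerators.Select(e => select(e.Current)));
    }
}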

Call as e.g.

foreach (var avg in AggregateAcross(
                     input,
                     pair => pair.y,
                     e => e.Average(y => y)))
{
    Console.WriteLine(avg);
}

Note that this stops as soon as one of the inner enumerators runs out of elements. Also, it needs something to dispose all the enumerators when you're done. Take a look at this answer for further ideas.

(Also note that this is completely ignoring the x values. As all your inputs are in order, and your desired output is also in order, the x values don't add anything.)
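
One possible way to handle the disposal mentioned above is to wrap the loop in try/finally, which C# allows around a yield return as long as there is no catch clause. The following is only a sketch under that assumption, not the answer's own code: the name AggregateAcrossDisposing is hypothetical, and the extra Count > 0 check is an added guard so that an empty outer sequence yields nothing instead of looping. The demo uses plain ints rather than CustomObject to stay self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateAcrossSketch
{
    // Hypothetical variant of AggregateAcross that disposes the inner enumerators.
    public static IEnumerable<V> AggregateAcrossDisposing<T, U, V>(
        IEnumerable<IEnumerable<T>> input,
        Func<T, U> select,
        Func<IEnumerable<U>, V> aggregate)
    {
        var enumerators = new List<IEnumerator<T>>();
        try
        {
            // Collect enumerators one at a time so that any already created
            // are still disposed if a later GetEnumerator call throws.
            foreach (var inner in input)
            {
                enumerators.Add(inner.GetEnumerator());
            }

            // Added guard: All() is true for an empty list, which would loop forever.
            while (enumerators.Count > 0 && enumerators.All(e => e.MoveNext()))
            {
                yield return aggregate(enumerators.Select(e => select(e.Current)));
            }
        }
        finally
        {
            // Runs when iteration completes or the caller disposes the iterator
            // early (for example by breaking out of a foreach).
            foreach (var e in enumerators)
            {
                e.Dispose();
            }
        }
    }

    public static void Main()
    {
        // The y values from the question's sample data.
        IEnumerable<IEnumerable<int>> rows = new[]
        {
            new[] { 2, 4, 6, 8 },
            new[] { 2, 0, 0, -2 },
            new[] { 2, 2, 0, 0 }
        };

        foreach (var avg in AggregateAcrossDisposing(rows, y => y, ys => ys.Average()))
        {
            Console.WriteLine(avg);
        }
    }
}
```

As in the original method, the sequence passed to aggregate is lazy and reads the enumerators' current elements, so it should be consumed inside the aggregate call rather than stored for later.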

Rawling
-1

I didn't test it, but I think this should work.

public void Test() {
    IEnumerable<IEnumerable<CustomObject>> data = ...;
    var result = data
        .SelectMany(x => x)
        .GroupBy(
            item => item.x,
            (key, r) => new { x = key, data = r.Select(z => z.y) }
        )
        .Select(x => new CustomObject { x = x.x, y = (int)x.data.Average() })
        .ToList();
}
Maarten