6

I have a datatable, wich contains a several columns. I want to calculate the average of these with linq, without grouping: I tried this:

dtPointCollFb.GroupBy(r => 1).Select(r => new
        {
            Plus100 = r.Average(item => item.PLUS100),
            Plus50 = r.Average(item => item.PLUS50),
            Plus10 = r.Average(item => item.PLUS10),
            Plus5 = r.Average(item => item.PLUS5),
            Plus1 = r.Average(item => item.PLUS1),
            NULLA = r.Average(item => item.NULLA)
        }).ToList()

This works perfectly, when I calculate Sum, but with average produce the first record values, not all records' average. I think this is unnecessary: .GroupBy(r => 1) because I'm grouping with constant.

But without this i can't use average just for the whole query, and only for one column:

dtPointCollFb.Average(r => r.PLUS100)

So can someone produce a simple best practice solution? If it's possible with method syntax, not query.

I want something like this:

dtPointCollFb.GroupBy(r => 0).Select(r => new
        {
            Plus100 = r.Sum(item => item.PLUS100) / dtPointCollFb.Rows.Count,
            Plus50 = r.Sum(item => item.PLUS50) / dtPointCollFb.Rows.Count,
            Plus10 = r.Sum(item => item.PLUS10) / dtPointCollFb.Rows.Count,
            Plus5 = r.Sum(item => item.PLUS5) / dtPointCollFb.Rows.Count,
            Plus1 = r.Sum(item => item.PLUS1) / dtPointCollFb.Rows.Count,
            NULLA = r.Sum(item => item.NULLA) / dtPointCollFb.Rows.Count
        }).ToList()

but with simpler and cleaner way. This produce the average correctly.

speti43
  • 2,886
  • 1
  • 20
  • 23
  • Not the same, but there is a suggestion to calculate the averages separately. – speti43 Jun 07 '13 at 09:34
  • Just for the record, the `GroupBy` construct is, I believe, intended to force all the averages to be computed on a single round-trip to the database. – Rawling Jun 07 '13 at 10:19
  • Yes, you are correct, it would be the best way, because I transport a lot of data from the sql server. This pattern was designed to make multiple aggregation on the webserver with linq. I calculate different things from the same data collection. I could do it in 5 different sql query. I don't know, maybe it would be better. – speti43 Jun 07 '13 at 11:21

2 Answers2

6

Enumerable.Average computes an average for a sequence of numbers. So you need to project (that is Select) a column for each average you need.

dtPointCollFb.Select(r => r.PLUS100).Average()

or

dtPointCollFb.Average(r => r.PLUS100)    

GroupBy builds a list of lists. In this case the (outer) list has one element. (since you're arbitrarily using the same key for all of the elements in the original list)

So what you have above could just as easily be written as:

var averages = new
{
    Plus100 = dtPointCollFb.Average(item => item.PLUS100),
    Plus50 = dtPointCollFb.Average(item => item.PLUS50),
    Plus10 = dtPointCollFb.Average(item => item.PLUS10),
    Plus5 = dtPointCollFb.Average(item => item.PLUS5),
    Plus1 = dtPointCollFb.Average(item => item.PLUS1),
    NULLA = dtPointCollFb.Average(item => item.NULLA)
};

Doing anything more would require customized extension functions (extending IEnumerable<DataTableClass>).

Christopher Stevenson
  • 2,843
  • 20
  • 25
3

If, as Christopher suggested, you want to use extensions, you can use the Aggregate method.

Given the class

public class DataPoint
{
    public double Plus100 { get; set; }
    public double Plus50 { get; set; }
    public double Plus10 { get; set; }
    public double Plus5 { get; set; }
    public double Plus1 { get; set; }
    public double NULLA { get; set; }
}

the averaging function would look like this (the manual counting avoids multiple enumerations through the collection):

public static DataPoint Average(this IEnumerable<DataPoint> dataPoints)
{
    var count = 0;
    var totals = dataPoints.Aggregate((lhs, rhs) =>
        {
            ++count;
            return new DataPoint
                {
                    Plus100 = lhs.Plus100 + rhs.Plus100,
                    Plus50 = lhs.Plus50 + rhs.Plus50,
                    Plus10 = lhs.Plus10 + rhs.Plus10,
                    Plus5 = lhs.Plus5 + rhs.Plus5,
                    Plus1 = lhs.Plus1 + rhs.Plus1,
                    NULLA = lhs.NULLA + rhs.NULLA
                };
        });

    return new DataPoint
        {
            Plus100 = totals.Plus100 / count,
            Plus50 = totals.Plus50 / count,
            Plus10 = totals.Plus10 / count,
            Plus5 = totals.Plus5 / count,
            Plus1 = totals.Plus1 / count,
            NULLA = totals.NULLA / count
        };
}

The advantage of this method is that it only goes through the collection once. If you have a large dataset, this will save compute power. If you have fewer datapoints, I would use Christopher's method.

Rob Lyndon
  • 12,089
  • 5
  • 49
  • 74
  • Good suggestion, I've made it like you did, but i'haven't made a linq extension method, because I use this calculation only once. But I also made a nested class, like you also made. Thanks! – speti43 Jun 07 '13 at 10:27