3

I'm trying to use Linq expressions to construct a query, and am stuck trying to group by multiple columns. Say I have a basic collection:

IEnumerable<Row> collection = new Row[]
{
    new Row() { Col1 = "a", Col2="x" },
    new Row() { Col1 = "a", Col2="x" },
    new Row() { Col1 = "a", Col2="y" },
};

I know you can group these using lambda expressions:

foreach (var grp in collection.GroupBy(item => new { item.Col1, item.Col2 }))
{
    Debug.Write("Grouping by " + grp.Key.Col1 + " and " + grp.Key.Col2 + ": ");
    Debug.WriteLine(grp.Count() + " rows");
}

This groups correctly as you can see:

Grouping by a and x: 2 rows
Grouping by a and y: 1 rows

But now, say I receive a collection of selectors to group against, that is passed to me as a parameter in my method, and that the entity type is generic:

void doLinq<T>(params Expression<Func<T,object>>[] selectors)
{
    // linq stuff
}

Whoever's invoking the method would call like this:

doLinq<Row>(entity=>entity.Col1, entity=>entity.Col2);

How would I construct the group-by expression?

foreach (var grp in collection.GroupBy(
      item => new { 
          // selectors??
      }))
{
    // grp.Key. ??
}

Edit

I updated above to hopefully clarify why I need the set of selectors.

Edit #2

Made the entity type in doLinq generic.

McGarnagle
  • 101,349
  • 31
  • 229
  • 260
  • 1
    Seems like you should be able to write a function that combines the several `Expression>` into a single `Expression>` that returns a tuple or an array of values or whatever. It looks like you're using linq to sql, and I don't know enough about the translation to sql to know whether you could do this in a translatable way, however. – phoog Apr 13 '12 at 20:15
  • @phoog it does seem like you should be able to combine them into a single expression-- I still don't know how to do that. I ended up combining the results of the expressions instead, for each row. – McGarnagle Apr 21 '12 at 20:28
  • You should post your solution as a new answer, not edit the question. – svick Apr 21 '12 at 20:30

4 Answers4

1

You should look at Dynamic Linq: http://blogs.msdn.com/b/mitsu/archive/2008/02/07/linq-groupbymany-dynamically.aspx

Vladimir Perevalov
  • 4,059
  • 18
  • 22
  • Very interesting and useful link, but it's not quite what I want. Mitsu is doing sub-groupings in a hierarchical relationship, but each grouping is done on a single key-- ie, group by countries, then for each group, group by cities. I want to have a single level of grouping, but multiple properties in the key, eg, group by country and age bracket (Col1 and Col2 in my example). – McGarnagle Apr 13 '12 at 18:59
  • Sorry I missed the target :) Here's a link similar question with what I meant as a solution: http://stackoverflow.com/questions/3929041/dynamic-linq-groupby-multiple-columns By Dynamic Linq I suggested using string based linq interfaces. – Vladimir Perevalov Apr 13 '12 at 19:04
1

Well, I'll assume you use linq-to-sql or something similar, so you need expression trees. If not there might be other possibilities.

Possible solutions I can see:

  • dynamic linq

see Vladimir Perevalovs answer.

  • constructing the whole groupby expression-tree manually

see http://msdn.microsoft.com/en-us/library/bb882637.aspx

  • ugly workaround

Well, that's my departement :)

untested code:

 void doLinq(params string[] selectors) // checking two expressions for equality is messy, so I used strings
     foreach (var grp in collection.GroupBy(
          item => new { 
              Col1 = (selectors.Contains("Col1") ? item.Col1 : String.Empty),
              Col2 = (selectors.Contains("Col2") ? item.Col2 : String.Empty)
              // need to add a line for each column :(
          }))
     {
          string[] grouping = (new string[]{grp.Key.Col1, grp.Key.Col2 /*, ...*/ }).Where(s=>!s.IsNullOrEmpty()).ToArray();
          Debug.Write("Grouping by " + String.Join(" and ", grouping)+ ": ");
          Debug.WriteLine(grp.Count() + " rows");
     }
 }
HugoRune
  • 13,157
  • 7
  • 69
  • 144
  • I realized I simplified the question too much at first- it obscured what I was trying to do. The type "Row" is actually generic (see update above), which would seem to rule out #1 and #3. #2 looks very interesting, I'm checking it out ... – McGarnagle Apr 15 '12 at 17:11
1

I have extremely limited knowledge of linq-to-sql, but is it really important what's inside GroupBy? Because if it is not, you can roll out your own keySelector. Anyway, I tried it with both Sql Server CE and Sql Server Express and this seems to work:

using System;
using System.Linq;
using System.Collections.Generic;
using System.Data.Linq;
using System.Linq.Expressions;

namespace ConsoleApplication1 {
    class Props {
        public List<object> list = new List<object>();
        public override bool Equals(object obj) {
            return Enumerable.SequenceEqual(list, (obj as Props).list);
        }
        public override int GetHashCode() {
            return list.Select(o => o.GetHashCode()).Aggregate((i1, i2) => i1 ^ i2);
        }
    }
    class Program {
        static void Main(string[] args) {
            Lol db = new Lol(@"Data Source=.\SQLExpress;Initial Catalog=Lol;Integrated Security=true");
            db.Log = Console.Out;
            doLinq(db.Test, row => row.Col1, row => row.Col2);
            Console.ReadLine();
        }
        static void doLinq<T>(Table<T> table, params Func<T, object>[] selectors) where T : class {
            Func<T, Props> selector = item => {
                var props = new Props();
                foreach (var sel in selectors) props.list.Add(sel(item));
                return props;
            };
            foreach (var grp in table.GroupBy(selector)) {
                Console.Write("Grouping by " + string.Join(", ", grp.Key.list) + ": ");
                Console.WriteLine(grp.Count() + " rows");
            }
        }
    }
}

Lol database has one table "Test" with three rows. The output is this:

SELECT [t0].[Col1], [t0].[Col2]
FROM [dbo].[Test] AS [t0]
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 4.0.30319.1

Grouping by a, x: 2 rows
Grouping by a, y: 1 rows

I checked the queries, and it seems that linq-to-sql is smart enough not to generate sql for groupBy when it can't, so it will iterate through all rows of the table and then group them on the client.

Edit: minor additions for completion's sake, and the connection string now assumes Sql Server Express.

user1096188
  • 1,809
  • 12
  • 11
  • Ingenious, but I couldn't get it working ... I'm not sure why. As is, the "Props.Equals" method returns true always in my test case; it seemed to be comparing the Linq expressions rather than the evaluations. Don't you need to use "Expression.Compile()" somewhere in there? – McGarnagle Apr 21 '12 at 20:25
  • Well `GroupBy` organises `Props` into a hash table, so `Props.Equals` will only be called for objects for which `Props.GetHashCode` return equal values, which means that they are most likely equal already. So by itself, `Props.Equals` always returning true is hardly a problem. You also don't need `Expressions` anywhere in the code, since the whole grouping happens on the client and generated sql is a simple `select` from a table. I also tried it with SQL Server Express with the identical results. – user1096188 Apr 22 '12 at 06:35
0

The solution worked for me. It involves two parts:

  • create a grouping object (which I implemented inelegantly as object[]) given the row value and the set of selectors. This involves a lambda expression that compiles and invokes each selector on the row item.
  • implement IEquality for the grouping object type (in my case that's IEqualityComparer).

First part

foreach (System.Linq.IGrouping<object[], T> g in collection.GroupBy(
    new Func<T, object[]>(
        item => selectors.Select(sel => sel.Compile().Invoke(item)).ToArray()
    ),
    new ColumnComparer()
)
{ ... }

Second Part

public class ColumnComparer : IEqualityComparer<object[]>
{
    public bool Equals(object[] x, object[] y)
    {
        return Enumerable.SequenceEqual(x, y);
    }

    public int GetHashCode(object[] obj)
    {
        return (string.Join("", obj.ToArray())).GetHashCode();
    }
}

This works for basic Linq, and Linq for the MySql connector. Which other Linq providers, and which expression types this works for is a whole other question ...

McGarnagle
  • 101,349
  • 31
  • 229
  • 260
  • 1
    I think this will query the database for all rows, then perform the grouping locally. So this will not work for linq-to-sql with a ms-sql server where the query is translated to t-sql, you would need an expression tree for that. Still, even in linq-to-sql you could always prepend .AsEnumerable() before your GroupBy() to force executing the grouping locally – HugoRune Apr 21 '12 at 22:02