0

I using Linq (together with EF) in order to access my database. I have object "Job", which contains several properties, some of them are "complex". My goal is to group jobs by these properties, and to get a count for each group.

Here my objects (simplified):

public class Job
{
    [Key]
    public int Id
    {
        get;
        set;
    }


    [Required]
    public Salary Salary
    {
        get;
        set;
    }


    [Required]
    public ICollection<Category> Categories
    {
        get;
        set;
    }     
}

"Category" is a complex class, and looks like this:

public class Category
{
    [Key]
    public int Id
    {
        get;
        set;
    }

    public Industry Industry //Example: Software
    {
        get;
        set;
    }


    public Field Field //Example: .NET
    {
        get;
        set;
    }


    public Position Position //Example: Developer
    {
        get;
        set;
    }
}

Industry, Field, Position and Salary classes contains just "int" id and "string" name.

I need to group list of Jobs by Industry, Field, Position and Salary and to get a count of each group. This is how I doing it right now:

var IndustryGroupsQuery = from t in Jobs.SelectMany(p => p.Categories)
                                          group t by new { t.Industry} into g
                                          select new 
                                          { 
                                              Tag = g.Key.Industry,
                                              Count = g.Count()
                                          };

var FieldsGroupsQuery = from t in Jobs.SelectMany(p => p.Categories)
                                          group t by new { t.Field} into g
                                          select new 
                                          { 
                                              Tag = g.Key.Field,
                                              Count = g.Count()
                                          };

var PositionsGroupsQuery = from t in Jobs.SelectMany(p => p.Categories)
                                          group t by new { t.Position} into g
                                          select new 
                                          { 
                                              Tag = g.Key.Position,
                                              Count = g.Count()
                                          };

Jobs.GroupBy(job => job.Salary)
                       .Select(group => new
                       {
                           Tag = group.Key,
                           Count = group.Count()
                       }))

This is works fine, but I wondering is it possible to improve somehow its performance.

Q1: I think, that probably one single query will perform better that four. Is it possible to combine these queries into one single query?

Q2: When I asking Linq to group by "Industry", how exactly it able to distinguish between one Industry to another? Is it implicitly comparing records' keys? Is it will be faster if I explicitly tell to linq which property to group by (e.g. "id") ?

Thanks!

Illidan
  • 4,047
  • 3
  • 39
  • 49

2 Answers2

0

Answer in reverse order:

Q2:
When you group by an object instead of a base type, it uses the standard equality comparer (obj x == obj y) which does a simple reference comparison (http://msdn.microsoft.com/en-us/library/bsc2ak47(v=vs.110).aspx). If that suits, it works, otherwise you can implement a custom equality comparer (How to implement IEqualityComparer to return distinct values?)

Q1:
If you wanted sub-groups of the groups, then you can do it in a single query. If you just want the counts for each, then you are doing it exactly the right way.

Community
  • 1
  • 1
David Colwell
  • 2,450
  • 20
  • 31
0

You can user conditional GROUP BY.

You can define a variable to tell the query which column to use for grouping. You can define an ENUM for GROUP BY columns.

 int groupByCol = 1; //Change the value of this field according to the field you want to group by

 var GenericGroupsQuery = from t in Jobs                                          
                             group t by new { GroupCol = ( groupByCol == 1 ? t.Industry:(groupByCol == 2 ? t.Field:(groupByCol == 3 ? t.Position : t.Job)))} into g
                             select new 
                             { 
                                Tag = g.Key,
                                Count = g.Count()
                             };
Dipendu Paul
  • 2,685
  • 1
  • 23
  • 20