I using Linq (together with EF) in order to access my database. I have object "Job", which contains several properties, some of them are "complex". My goal is to group jobs by these properties, and to get a count for each group.
Here my objects (simplified):
public class Job
{
[Key]
public int Id
{
get;
set;
}
[Required]
public Salary Salary
{
get;
set;
}
[Required]
public ICollection<Category> Categories
{
get;
set;
}
}
"Category" is a complex class, and looks like this:
public class Category
{
[Key]
public int Id
{
get;
set;
}
public Industry Industry //Example: Software
{
get;
set;
}
public Field Field //Example: .NET
{
get;
set;
}
public Position Position //Example: Developer
{
get;
set;
}
}
Industry, Field, Position and Salary classes contains just "int" id and "string" name.
I need to group list of Jobs by Industry, Field, Position and Salary and to get a count of each group. This is how I doing it right now:
var IndustryGroupsQuery = from t in Jobs.SelectMany(p => p.Categories)
group t by new { t.Industry} into g
select new
{
Tag = g.Key.Industry,
Count = g.Count()
};
var FieldsGroupsQuery = from t in Jobs.SelectMany(p => p.Categories)
group t by new { t.Field} into g
select new
{
Tag = g.Key.Field,
Count = g.Count()
};
var PositionsGroupsQuery = from t in Jobs.SelectMany(p => p.Categories)
group t by new { t.Position} into g
select new
{
Tag = g.Key.Position,
Count = g.Count()
};
Jobs.GroupBy(job => job.Salary)
.Select(group => new
{
Tag = group.Key,
Count = group.Count()
}))
This is works fine, but I wondering is it possible to improve somehow its performance.
Q1: I think, that probably one single query will perform better that four. Is it possible to combine these queries into one single query?
Q2: When I asking Linq to group by "Industry", how exactly it able to distinguish between one Industry to another? Is it implicitly comparing records' keys? Is it will be faster if I explicitly tell to linq which property to group by (e.g. "id") ?
Thanks!