I have the following database structure (EDMX models):
public class Company
{
public Guid Id { get; set; }
public virtual ICollection<Location> Locations { get; set; }
// ... 15 more columns
}
public class Location
{
public Guid Id { get; set; }
public virtual Company Company { get; set; }
public Guid CompanyId { get; set; }
public virtual ICollection<ReportA> ReportsA { get; set; }
public virtual ICollection<ReportB> ReportsB { get; set; }
public virtual ICollection<ReportC> ReportsC { get; set; }
// ... 15 more columns with information - name, description etc.
}
public class ReportA
{
public virtual Location Location { get; set; }
public Guid LocationId { get; set; }
// 30 more columns of type "int?"
}
public class ReportB
{
public virtual Location Location { get; set; }
public Guid LocationId { get; set; }
// 30 more columns of type "int?"
}
public class ReportC
{
public virtual Location Location { get; set; }
public Guid LocationId { get; set; }
// 30 more columns of type "int?"
}
One company can have many locations. Each location have many reports from A, B and C.
The columns from ReportA
, ReportB
, ReportC
tables are different.
Each one of the Report
tables have аpproximately 40 000 rows.
The Company
and Location
tables have ~ 5000 rows
I need to get all of the data and make a summary report.
The code is:
using (ComapnyEntities dataBaseContext = new ComapnyEntities())
{
IQueryable<Locations> query = dataBaseContext.Locations
.AsNoTracking()
.Where(location => companyIds.Contains(location.CompanyId))
.Include(location => location.Company)
.Include(location => location.ReportsA)
.Include(location => location.ReportsB)
.Include(location => location.ReportsC);
// more filtation
return query.ToList();
// assume that companyIds have all company Ids
}
In most cases I can use Skip()
and Take()
methods which makes execution faster (i.e. .Take(10)
), but in one special case I need to pull all locations along with company information, ReportsA, ReportsB, ReportsC.
based on comment in Stackoverflow, Include()
makes Cartesian product by rows. https://stackoverflow.com/a/22625208/6142097
// so if that is true:
4 000 * 4 000 * 40 000 * 40 000 * 40 000 = ?? (1.024e+21)
The "good part" is that the page takes 15 sec to load (locally), but in production is not the same and takes longer.
So is there a way to improve the performance of the query?
- I Tried with Entity-framework code is slow when using Include() many times - it's slower
- I have added Non-Clustured index of
Location.CompanyId
,ReportA.LocationId
,ReportB.LocationId
,ReportC.LocationId
,LocationId
- no difference (I am not sure indexes are correctly added) - Cannot construct native SQL query, that joins all the tables, which can be executed for less than 10 sec.
- I Tried to replace
.Inlude()
with.IncludeOptimized()
but no real difference.
My machine is using SQL Server 2014 with windows 8.1
What are your suggestions? Is there a way to improve my code?