0

Right now, I'm trying to write a method for a survey submission program that utilizes a very normalized schema.

I have a method that is meant to generate a survey for a team of people, linking several different EF models together in the process. However, this method runs EXTREMELY slowly for anything but the smallest team sizes (taking 11.2 seconds to execute for a 4-person team, and whopping 103.9 seconds for an 8 person team). After some analysis, I found that 75% of the runtime is taken up in the following block of code:

 var TeamMembers = db.TeamMembers.Where(m => m.TeamID == TeamID && m.OnTeam).ToList();
                    foreach (TeamMember TeamMember in TeamMembers)
                    {
                        Employee employee = db.Employees.Find(TeamMember.EmployeeID);
                        SurveyForm form = new SurveyForm();
                        form.Submitter = employee;
                        form.State = "Not Submitted";
                        form.SurveyGroupID = surveygroup.SurveyGroupID;
                        db.SurveyForms.Add(form);
                        db.SaveChanges();

                        foreach (TeamMember peer in TeamMembers)
                        {
                            foreach (SurveySectionDetail SectionDetail in sectionDetails)
                            {
                                foreach (SurveyAttributeDetail AttributeDetail in attributeDetails.Where(a => a.SectionDetail.SurveySectionDetailID == SectionDetail.SurveySectionDetailID) )
                                {
                                    SurveyAnswer answer = new SurveyAnswer();
                                    answer.Reviewee = peer;
                                    answer.SurveyFormID = form.SurveyFormID;
                                    answer.Detail = AttributeDetail;
                                    answer.SectionDetail = SectionDetail;
                                    db.SurveyAnswers.Add(answer);
                                    db.SaveChanges();
                                }
                            }
                        }
                    }

I'm really at a loss as to how I might go about cutting back the runtime. Is this just the price I pay for having this many related entities? I know that joins are expensive operations, and that I've essentially got 3 Or is there some inefficiency that I'm overlooking?

Thanks for your help!

EDIT: As requested by Xiaoy312, here's how sectionDetails and attributeDetails are defined:

SurveyTemplate template = db.SurveyTemplates.Find(SurveyTemplateID);
List<SurveySectionDetail> sectionDetails = new List<SurveySectionDetail>();
List<SurveyAttributeDetail> attributeDetails = new List<SurveyAttributeDetail>();
                    foreach (SurveyTemplateSection section in template.SurveyTemplateSections)
                    {
                        SurveySectionDetail SectionDetail = new SurveySectionDetail();
                        SectionDetail.SectionName = section.SectionName;
                        SectionDetail.SectionOrder = section.SectionOrder;
                        SectionDetail.Description = section.Description;
                        SectionDetail.SurveyGroupID = surveygroup.SurveyGroupID;
                        db.SurveySectionDetails.Add(SectionDetail);
                        sectionDetails.Add(SectionDetail);
                        db.SaveChanges();

                        foreach (SurveyTemplateAttribute attribute in section.SurveyTemplateAttributes)
                        {
                            SurveyAttributeDetail AttributeDetail = new SurveyAttributeDetail();
                            AttributeDetail.AttributeName = attribute.AttributeName;
                            AttributeDetail.AttributeScale = attribute.AttributeScale;
                            AttributeDetail.AttributeType = attribute.AttributeType;
                            AttributeDetail.AttributeOrder = attribute.AttributeOrder;
                            AttributeDetail.SectionDetail = SectionDetail;
                            db.SurveyAttributeDetails.Add(AttributeDetail);
                            attributeDetails.Add(AttributeDetail);
                            db.SaveChanges();  
                        }
                    }
Tom Kreamer
  • 113
  • 1
  • 2
  • 12
  • 1
    It seems strange that you nested a TeamMembers foreach inside another TeamMembers foreach. Is this intended? – Xiaoy312 Aug 19 '15 at 17:04
  • Yep, that's the functionality I'm going for. There's a recursive relationship going on there where one TeamMember is submitting a review (he's the "reviewer" ) and the other is being reviewed (the "reviewee"). – Tom Kreamer Aug 19 '15 at 17:12
  • It looks like you are doing several round trip to the database. Howeverm it is hard to assess the situation without knowing how are `sectionDetails` & `attributeDetails` defined. – Xiaoy312 Aug 19 '15 at 17:25
  • Any particular reason for immediately saving each time, instead of just once at the end? –  Aug 19 '15 at 17:43
  • How much time it takes if you comment the `db.SaveChanges`? – Taher Rahgooy Aug 19 '15 at 18:25
  • The issue I had with saving at the end is that the navigation properties that are set in the innermost loop (Reviewee, SurveyFormID, SectionDetail) are in the loop scope. When I do a save changes outside the loop, the SurveyAnswer gets saved, but all of the aforementioned foreign keys are null. – Tom Kreamer Aug 19 '15 at 23:41

2 Answers2

3

There is several points that you can improve :

  1. Do not SaveChanges() on each Add() :

    foreach (TeamMember TeamMember in TeamMembers)
    {
        ...
        // db.SaveChanges();
    
        foreach (TeamMember peer in TeamMembers)
        {
            foreach (SurveySectionDetail SectionDetail in sectionDetails)
            {
                foreach (SurveyAttributeDetail AttributeDetail in attributeDetails.Where(a => a.SectionDetail.SurveySectionDetailID == SectionDetail.SurveySectionDetailID) )
                {
                    ...
                    // db.SaveChanges();
                }
            }
        }
    
        db.SaveChanges();
    }
    
  2. Consider to reduce the numbers of round trips to the database. This can be done by : they are memory-intensive
    • using Include() to preload your navigation properties; or
    • cashing the partial or whole table with ToDictionary() or ToLookup()
  3. Instead of Add(), use AddRange() or even BulkInsert() from EntityFramework.BulkInsert if that fits your setup :

    db.SurveyAnswers.AddRange(
        TeamMembers.SelectMany(p => 
            sectionDetails.SelectMany(s => 
                attributeDetails.Where(a => a.SectionDetail.SurveySectionDetailID == s.SurveySectionDetailID)
                    .Select(a => new SurveyAnswer()
                    {
                        Reviewee = p,
                        SurveyFormID = form.SurveyFormID,
                        Detail = a,
                        SectionDetail = s,
                    }))));
    
Xiaoy312
  • 14,292
  • 1
  • 32
  • 44
  • Thanks for the suggestion! I tried your method, and ended up moving that SaveChanges outside the loop entirely, by building a collection of answers inside the loop, then performing an AddRange(). This works, but unfortunately only cuts the 8-person run time by 10 seconds (from 103 to 93). – Tom Kreamer Aug 19 '15 at 17:58
  • I guess most of the time is spent on the insert part because `Add()` perform one `INSERT` command per entity. Same thing with `AddRange()`. See if you can use EntityFramework.BulkInsert in your setup. – Xiaoy312 Aug 19 '15 at 18:02
  • Oops, just realized I added that change in the wrong place: Changing to an AddRange() outside the loop actually reduced runtime from 103 seconds to 71 seconds. Not bad! – Tom Kreamer Aug 19 '15 at 18:13
2

Use Include to avoid SELECT N + 1 issue.

SurveyTemplate template = db.SurveyTemplates.Include("SurveyTemplateSections")
             .Include("SurveyTemplateSections.SurveyTemplateAttributes")
             .First(x=> x.SurveyTemplateID == SurveyTemplateID);

Generate the whole object graph and then save to DB.

List<SurveySectionDetail> sectionDetails = new List<SurveySectionDetail>();
List<SurveyAttributeDetail> attributeDetails = new List<SurveyAttributeDetail>();
foreach (SurveyTemplateSection section in template.SurveyTemplateSections)
{
   SurveySectionDetail SectionDetail = new SurveySectionDetail();
   //Some code
   sectionDetails.Add(SectionDetail);

   foreach (SurveyTemplateAttribute attribute in section.SurveyTemplateAttributes)
   {
        SurveyAttributeDetail AttributeDetail = new SurveyAttributeDetail();
        //some code
        attributeDetails.Add(AttributeDetail);
   }
}
db.SurveySectionDetails.AddRange(sectionDetails);
db.SurveyAttributeDetails.AddRange(attributeDetails);
db.SaveChanges();

Load all employees you want before the loop, this will avoids database query for every team member.

var teamMemberIds =  db.TeamMembers.Where(m => m.TeamID == TeamID && m.OnTeam)
    .Select(x=>x.TeamMemberId).ToList(); 
var employees = db.Employees.Where(x => teamMemberIds.Contains(x.EmployeeId));

create a dictionary for attributeDetails based on their sectionDetailId to avoid query the list on every iteration.

var attributeDetailsGroupBySection = attributeDetails.GroupBy(x => x.SectionDetailId)
       .ToDictionary(x => x.Key, x => x);

Move saving of SurveyAnswers and SurveyForms to outside of the loops:

List<SurveyForm> forms = new List<SurveyForm>();
List<SurveyAnswer> answers = new List<SurveyAnswer>();
foreach (int teamMemberId in teamMemberIds)
{
    var employee = employees.First(x => x.Id == teamMemberId);
    SurveyForm form = new SurveyForm();
    //some code
    forms.Add(form);
    foreach (int peer in teamMemberIds)
    {
         foreach (SurveySectionDetail SectionDetail in sectionDetails)
         {
              foreach (SurveyAttributeDetail AttributeDetail in 
                         attributeDetailsGroupBySection[SectionDetail.Id])
              {
                   SurveyAnswer answer = new SurveyAnswer();
                   //some code
                   answers.Add(answer);
              }
         }
    }
}
db.SurveyAnswers.AddRange(answers);
db.SurveyForms.AddRange(forms);
db.SaveChanges();

Finally if you want faster insertions you can use EntityFramework.BulkInsert. With this extension, you can save the data like this:

db.BulkInsert(answers);
db.BulkInsert(forms);
Community
  • 1
  • 1
Taher Rahgooy
  • 6,528
  • 3
  • 19
  • 30