3

I have never heard of it, but people are refering to an issue in an application as an "N+1 problem". They are doing a Linq to SQL based project, and a performance problem has been identified by someone. I don't quite understand it - but hopefully someone can steer me.

It seems that they are trying to get a list of obects, and then the Foreach after that is causing too many database hits:

From what I understand, the second part of the source is only being loaded in the forwach.

So, list of items loaded:

var program = programRepository.SingleOrDefault(r => r.ProgramDetailId == programDetailId);

And then later, we make use of this list:

foreach (var phase in program.Program_Phases)
{
    phase.Program_Stages.AddRange(stages.Where(s => s.PhaseId == phase.PhaseId));
    phase.Program_Stages.ForEach(s =>
    {
         s.Program_Modules.AddRange(modules.Where(m => m.StageId == s.StageId));
    });
    phase.Program_Modules.AddRange(modules.Where(m => m.PhaseId == phase.PhaseId));
}

It seems the problem idetified is that, they expected 'program' to contain it's children. But when we refer to the child in the query, it reloads the proram:

program.Program_Phases

They're expecting program to be fully loaded and in memory, and profilder seems to indicate that program table, with all the joins is being called on each 'foreach'.

Does this make sense?

(EDIT: I foind this link: Does linq to sql automatically lazy load associated entities? This might answer my quetion, but .. they're using that nicer (where person in...) notation, as opposed to this strange (x => x....). So if this link Is the answer - i.e, we need to 'join' in the query - can that be done?)

Community
  • 1
  • 1
Craig
  • 18,074
  • 38
  • 147
  • 248

2 Answers2

5

In ORM terminology, the 'N+1 select problem' typically occurs when you have an entity that has nested collection properties. It refers to the number of queries that are needed to completely load the entity data into memory when using lazy-loading. In general, the more queries, the more round-trips from client to server and the more work the server has to do to process the queries, and this can have a huge impact on performance.

There are various techniques for avoiding this problem. I am not familiar with Linq to SQL but NHibernate supports eager fetching which helps in some cases. If you do not need to load the entire entity instance then you could also consider doing a projection query. Another possibility is to change your entity model to avoid having nested collections.

mtraudt
  • 103
  • 4
3

For performant linq first work out exactly what properties you actually care about. The one advantage that linq has performance-wise is that you can easily leave out retrieval of data you won't use (you can always hand-code something that does better than linq does, but linq makes it easy to do this without creating a library full of hundreds of classes for slight variants of what you leave out each time).

You say "list" a few times. Don't go obtaining lists if you don't need to, only if you'll re-use the same list more than once. Otherwise working one item at a time will perform better 99% of the time.

The "nice" and "strange" syntaxes as you put it are different ways to say the same thing. There are some things that can only be expressed with methods and lambdas, but the other form can always be expressed as such - indeed is after compilation. The likes of from b in someSource where b.id == 21 select b.name becomes compiled as someSource.Where(b => b.id == 21).Select(b => b.name) anyway.

You can set DataContext.LoadOptions to define exactly which entities you want loaded with which object (and best of all, set it differently in different uses quite easily). This can solve your N+1 issue.

Alternatively you might find it easer to update existing entities or insert new ones, by setting the appropriate id yourself.

Even if you don't find it easier, it can be faster in some cases. Normally I'd say go with the easier and forget the performance as far as this choice goes, but if performance is a concern here (and you say it is), then that can make profiling to see which of the two works better, worthwhile.

Jon Hanna
  • 110,372
  • 10
  • 146
  • 251