0

I'm building this kind of tree data structure, 4 or 5 levels deep, from related collections of monthly volume data for car manufacturers, models, engines, etc. It takes up to a minute to build, though. Is there a faster way of doing it?

var carData = (from manufacturer in manufacturerMonths.Select(m => m.Manufacturer).Distinct()
               select new
               {
                   ManufacturerData = (from manufacturerMonth in manufacturerMonths
                                           .Where(t => t.Manufacturer == manufacturer)
                       select new
                       {
                           Date = manufacturerMonth.Date,
                           Volume = manufacturerMonth.Volume,
                           Models = new
                           {
                               ModelsData = (from model in modelMonths
                                                 .Where(m => m.Manufacturer == manufacturer)
                                                 .Select(m => m.Model).Distinct()
                                   select new
                                   {
                                       ModelData = (from modelMonth in modelMonths
                                                        .Where(m => m.Model == model)
                                           select new
                                           {
                                               Date = modelMonth.Date,
                                               Volume = modelMonth.Volume,
                                               Engines = new
                                               {
                                                   EnginesData = (from engine in engineMonths
                                                                      .Where(e => e.Model == model)
                                                                      .Select(e => e.Engine).Distinct()
                                                       select new
                                                       {
                                                           EngineData = ....
                                                       })
                                               }
                                           })
                                   })
                           }
                       })
               });
Willis
  • How many objects are in `manufacturerMonths`? If you split all of this into separate operations, which operation takes the most time? You can also run the code under a profiler to see what takes the most time. – Sergey Litvinov Jun 10 '15 at 09:15
  • What type is `modelMonth`? `engineMonths`? `manufacturerMonths`? – Lasse V. Karlsen Jun 10 '15 at 11:48
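
One way to act on the suggestion to time separate operations is `System.Diagnostics.Stopwatch`. This is a minimal sketch, assuming the data sits in memory; the generated `engineMonths` rows and the two stages are hypothetical stand-ins, not the question's real data:

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical sample rows; the real collections come from the question's data source.
var engineMonths = Enumerable.Range(0, 20_000)
    .Select(i => (Model: "Model" + (i % 200), Engine: "Engine" + (i % 800), Volume: i))
    .ToList();

// Stage 1: distinct models. ToList forces the query to run inside the timed region.
var sw = Stopwatch.StartNew();
var models = engineMonths.Select(e => e.Model).Distinct().ToList();
sw.Stop();
Console.WriteLine($"Distinct models: {sw.ElapsedMilliseconds} ms");

// Stage 2: rescan the whole list once per model, as the nested query does.
sw.Restart();
var enginesPerModel = models
    .Select(m => engineMonths.Where(e => e.Model == m)
                             .Select(e => e.Engine)
                             .Distinct()
                             .Count())
    .ToList();
sw.Stop();
Console.WriteLine($"Engines per model: {sw.ElapsedMilliseconds} ms");
```

Timing each stage with the real collections should show which level of the nesting dominates before any rewrite is attempted.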

2 Answers

1

The inner LINQ queries are executed once for each item of the outer query, just like nested foreach loops. So make sure that you do not process the same data over and over again. Using dictionaries may improve performance:

E.g. (just to show what I mean, don't kill me if it's wrong ...)

Instead of

                                           EnginesData = (from engine in engineMonths
                                           .Where(e => e.Model == model)
                                           .Select(e => e.Engine).Distinct()
                                              select new 
                                              {
                                                 EngineData = ....
                                              }

Create a dictionary at the beginning:

Dictionary<Model, Engine[]> modelToEnginesDict = engineMonths
  .GroupBy(e => e.Model)
  .ToDictionary(
    x => x.Key,
    x => x.Select(e => e.Engine).Distinct().ToArray());

and use it in the linq statement:

EnginesData = (from engine in modelToEnginesDict[model]
  select new 
  {
     EngineData = ....
  })

At the same time, this splits the huge query into smaller pieces, each of which pre-processes part of the data.
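
The same pre-grouping idea can also be written with `ToLookup`, which groups a collection once and returns an empty sequence for a missing key, where a `Dictionary` indexer would throw. The following is only a sketch: the tuple rows, the string-typed keys, and the shape of `tree` are assumptions made for illustration, not the question's actual types.

```csharp
using System;
using System.Linq;

// Hypothetical sample rows standing in for the question's collections.
var modelMonths = new[]
{
    (Manufacturer: "Acme", Model: "A1", Date: new DateTime(2015, 1, 1), Volume: 10),
    (Manufacturer: "Acme", Model: "A1", Date: new DateTime(2015, 2, 1), Volume: 12),
    (Manufacturer: "Acme", Model: "A2", Date: new DateTime(2015, 1, 1), Volume: 7),
};
var engineMonths = new[]
{
    (Model: "A1", Engine: "1.6", Date: new DateTime(2015, 1, 1), Volume: 6),
    (Model: "A1", Engine: "2.0", Date: new DateTime(2015, 1, 1), Volume: 4),
    (Model: "A1", Engine: "1.6", Date: new DateTime(2015, 2, 1), Volume: 7),
};

// Group each collection once, up front.
var modelMonthsByModel = modelMonths.ToLookup(m => m.Model);
var modelsByManufacturer = modelMonths.ToLookup(m => m.Manufacturer, m => m.Model);
var enginesByModel = engineMonths.ToLookup(e => e.Model, e => e.Engine);

// Each level is now a keyed lookup instead of a scan of the whole collection.
var tree = modelsByManufacturer.Select(g => new
{
    Manufacturer = g.Key,
    Models = g.Distinct().Select(model => new
    {
        Model = model,
        Months = modelMonthsByModel[model]
            .Select(mm => new { Date = mm.Date, Volume = mm.Volume })
            .ToList(),
        // A model with no engine rows simply gets an empty list here.
        Engines = enginesByModel[model].Distinct().ToList()
    }).ToList()
}).ToList();

Console.WriteLine(tree[0].Models.Count);
```

The final `ToList` calls materialize every level, so the tree is built once and not re-evaluated on each enumeration.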

Stefan Steinegger
0

So much LINQ :O

IMHO, try a hierarchy of classes built on "has-a" relationships (composition).

LINQ queries are lazily evaluated, and that will cost you performance overhead. See: Is a LINQ statement faster than a 'foreach' loop?
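
Lazy (deferred) evaluation means a query runs again every time it is enumerated, which is why an un-materialized inner query repeats its work inside an outer loop. A small sketch with made-up data that counts how often the filter predicate is called:

```csharp
using System;
using System.Linq;

var numbers = Enumerable.Range(1, 1000).ToArray();
var predicateCalls = 0;

// Defining the query runs nothing yet.
var evens = numbers.Where(n => { predicateCalls++; return n % 2 == 0; });
var callsAfterDefinition = predicateCalls;

// Each enumeration re-runs the filter over all 1000 items.
var firstCount = evens.Count();
var secondCount = evens.Count();
var callsAfterTwoEnumerations = predicateCalls;

// Materializing once with ToList lets later reads reuse the stored result.
var cached = evens.ToList();
var thirdCount = cached.Count;
var fourthCount = cached.Count;
var callsAfterCaching = predicateCalls;

Console.WriteLine($"{callsAfterDefinition}, {callsAfterTwoEnumerations}, {callsAfterCaching}");
```

Calling `ToList` or `ToArray` on intermediate results that are read more than once avoids this repeated work.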

If you can, write your own code to iterate over the lists, so that you can break out of a loop on some condition (if this is really possible with your design). It will be a bit of a pain, but it will pay off in the long run.
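
A hand-written loop along these lines might fill the nested structure in a single pass and stop early where that makes sense. This is only a sketch: the tuple rows and the dictionary-of-dictionaries shape are hypothetical, chosen for illustration instead of the question's real types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample rows standing in for engineMonths.
var engineMonths = new[]
{
    (Model: "A1", Engine: "1.6", Volume: 6),
    (Model: "A1", Engine: "2.0", Volume: 4),
    (Model: "A1", Engine: "1.6", Volume: 7),
    (Model: "A2", Engine: "1.2", Volume: 3),
};

// model -> engine -> monthly volumes, filled in a single pass.
var volumesByModelAndEngine = new Dictionary<string, Dictionary<string, List<int>>>();
foreach (var row in engineMonths)
{
    if (!volumesByModelAndEngine.TryGetValue(row.Model, out var engines))
    {
        engines = new Dictionary<string, List<int>>();
        volumesByModelAndEngine[row.Model] = engines;
    }
    if (!engines.TryGetValue(row.Engine, out var volumes))
    {
        volumes = new List<int>();
        engines[row.Engine] = volumes;
    }
    volumes.Add(row.Volume);
}

// A loop can also stop as soon as a condition is met.
var firstModelWithEngine = "";
foreach (var row in engineMonths)
{
    if (row.Engine == "2.0")
    {
        firstModelWithEngine = row.Model;
        break;
    }
}

Console.WriteLine(volumesByModelAndEngine["A1"]["1.6"].Count);
```

Each row is visited once while the structure is built, so the cost grows with the number of rows and not with rows multiplied by models.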

The code will also be easier to debug. A LINQ query nested this deeply, creating "new" (anonymous) objects, will be slow and a pain to check when something does not behave as expected.

Bhanu Chhabra