2

Hey everyone i have a working LINQ query but i was wondering if it could be optimized even more..

this is what i'm doing right now:

var selectionsAnimals = await this.context.SelectionAnimals
                                        .GroupBy(a => a.AnimalId)
                                        .ToListAsync();

var animalsSelections = selectionsAnimals
                         .Select(a => new AnimalsSelection
                                 {
                                     AnimalId = a.First().AnimalId,
                                     SelectionIds = a.Select(b => b.SelectionId)
                                                   .OrderBy(b => b)
                                                   .ToList()
                                 })
                         .ToList();

I was wondering if these two LINQ statements could be merged and as one awaitable LINQ query to the database?

This is the Database entity:

public class SelectionAnimals
{
    public int Id { get; set; }
    public int AnimalId { get; set; }
    public int SelectionId { get; set; }
    public DateTime CreatedAt { get; set; }
    public DateTime UpdatedAt { get; set; }
}

This is the model:

public class AnimalsSelection
{
    public int AnimalId { get; set; }
    public IList<int> SelectionIds { get; set; }
}

in the end i'm mapping animalsSelection to a IEnumerable<AnimalsSelection> (the model)

Evk
  • 98,527
  • 8
  • 141
  • 191
Baklap4
  • 3,914
  • 2
  • 29
  • 56

3 Answers3

4

Michael Coxon is right in general, but EF Core currently is quite unstable in even basic query construct processing.

So (sadly) the most important question is what EF Core version are you using. It's really a trial and error process. For instance, in EF Core 1.1.0 (release), Michael's construct does not work but the following equivalent does:

var animalsSelections = await db.SelectionAnimals
    .GroupBy(a => a.AnimalId)
    .Select(g => new AnimalsSelection
    {
        AnimalId = g.Key,
        SelectionIds = g.OrderBy(a => a.SelectionId)
                        .Select(a => a.SelectionId)
                        .ToList()
    })
    .ToListAsync();
Community
  • 1
  • 1
Ivan Stoev
  • 195,425
  • 15
  • 312
  • 343
  • Unfortunately I can't test on 1.0.0 here, let me know if the above works. – Ivan Stoev Nov 23 '16 at 11:42
  • Ahh.. so is it the reversed order of the `OrderBy` and `Select` since it is on the real object not a bunch of `int`'s? – Michael Coxon Nov 23 '16 at 11:42
  • @MichaelCoxon I have no idea why. Both are equivalent and should translate to one and the same. Bugs, bugs ... – Ivan Stoev Nov 23 '16 at 11:45
  • @Baklap4 since you care about performance, and there might be millions of rows - don't forget to check what sql query is generated by this LINQ query, it might be not what you would expect. – Evk Nov 23 '16 at 11:59
  • @Evk EF Core has mixed execution concept. You'll probably be surprised that all the solutions (OP, mine, yours, Michael's) generate one and the same SQL: `SELECT [a].[Id], [a].[AnimalId], [a].[CreatedAt], [a].[SelectionId], [a].[UpdatedAt] FROM [SelectionAnimals] AS [a] ORDER BY [a].[AnimalId]` – Ivan Stoev Nov 23 '16 at 12:26
  • No I'm not surprised, because that's exactly why I warn @Baklap4 to take care about which query is generated. He might be surprised that there is no GROUP BY statement in that query at all for example. Many people are sure that EF LINQ query always executes entirely by database and there are no internal in-memory operations by EF itself (what you call mixed execution I suppose) at all. In this case, perfomance of this query when number of SelectionAnimals will go from 34 to millions (in production) might be very surprising (and you can do better with raw sql query of course). – Evk Nov 23 '16 at 12:34
  • @Evk EF Core Roadmap says they are working on. So some day it would be as the people expect. Until then, the documentation has a section called *is it right for you* :) – Ivan Stoev Nov 23 '16 at 12:37
  • I'm not going to touch it with a ten meter pole at the current state :) But thing is - EF 6 behaves in a similar way on some complex queries. For similar query it will generate of course not THAT bad sql query, but still surprising (no group by also for example). And maybe it cannot generate better even (as stated in answer by Euphoric). – Evk Nov 23 '16 at 12:42
1

GroupBy returns an IQueryable so I am pretty sure there is nothing stopping you from doing...

var animalsSelections = await this.context.SelectionAnimals
                        .GroupBy(a => a.AnimalId)
                        .Select(a => new AnimalsSelection
                        {
                            AnimalId = a.First().AnimalId,
                            SelectionIds = a.Select(b => b.SelectionId)
                                            .OrderBy(b => b)
                                            .ToList()
                        })    
                        .ToListAsync();      
Michael Coxon
  • 5,311
  • 1
  • 24
  • 51
  • This is exactly what i've tried. Yet this gives me an Error: viewable at this link: http://pastebin.com/0aQGqNcB I think this is due to the fact of the list within a list (by creating a new object of AnimalsSelection (the model)) since this list is not created async. – Baklap4 Nov 23 '16 at 10:54
  • ok - well change your model to be `public IEnumerable SelectionIds { get; set; }` and get rid of the `ToList` in the `Select` – Michael Coxon Nov 23 '16 at 11:07
  • I've changed the model and got rid of the `ToList` in the `Select` Yet still the same. As SelectionId's is now a IEnumerable it can't still be used for parameter of type: `System.Collections.Generic.IAsyncEnumerable`1[System.Int32]` – Baklap4 Nov 23 '16 at 11:15
  • @MichaelCoxon *there is nothing stopping you from doing* True in general, but there is something called EF Core where (currently) general rules do not apply :( – Ivan Stoev Nov 23 '16 at 11:28
  • @IvanStoev So this is an EFCore thing? Because I swear I have done this in EF6 before... I could be mistaken.. – Michael Coxon Nov 23 '16 at 11:30
  • Yeah, EF Core currently is a big disappointment (and EF6 regression). – Ivan Stoev Nov 23 '16 at 11:40
0

There really isn't faster way than getting all AnimalId and SelectionId and then grouping them on the client. It could be slightly improved by ordering by AnimalId and then using custom group-by that assumes the items are sorted by key. This might actually be required if you are also ordering the SelectionIds.

But as you say in your comment, you are going to have millions of records, which means that the query will return all of them.

Another option is to concatenate the SelectionIds into single comma-separated column, but that is outside of power of LINQ, so you have to go through raw SQL.

Community
  • 1
  • 1
Euphoric
  • 12,645
  • 1
  • 30
  • 44