3

I have four tables: Users, Songs, Genres, and FavoriteGenres.

Users are allowed to favorite genres and when they navigate to their dashboard, they are presented with a list of songs that are within any of their favorite genres.

I have the following EF core code which accomplishes this:

dbContext
    .Songs
    .Include(song => song.Genre)
    .Where(song => song.Genre.FavoriteGenres.Any(favoriteGenre => favoriteGenre.UserId == userId))
    .OrderBy(song => song.CreatedDateTimeOffset)
    .ToListAsync(cancellationToken);

The query that is output (which works fine) is:

SELECT [s].[Id], [s].[Title], [s].[CreatedDateTimeOffset], [g].[Id], [g].[Name]
FROM [Songs] AS [s]
INNER JOIN [Genres] AS [g] ON [s].[GenreId] = [g].[Id]
WHERE EXISTS (
  SELECT 1
  FROM [FavoriteGenres] AS [f]
  WHERE ([g].[Id] = [f].[GenreId]) AND ([f].[UserId] = @__userId_0))
ORDER BY [s].[CreatedDateTimeOffset] DESC

I was just wondering if it is possible to hint at EF core to generate the following query, which I believe may be more optimal (though I would love to be told I am wrong!):

SELECT [s].[Id], [s].[Title], [s].[CreatedDateTimeOffset], [g].[Id], [g].[Name]
FROM [Songs] AS [s]
INNER JOIN [Genres] AS [g] ON [s].[GenreId] = [g].[Id]
INNER JOIN [FavoriteGenres] AS [f] ON [f].[GenreId] = [g].[Id]
WHERE [f].[UserId] = @__userId_0
ORDER BY [s].[CreatedDateTimeOffset] DESC

The execution plans look very similar, but quite honestly I am not an expert at reading them. Any sort of help would be very appreciated :)

Anthony Iacono
  • 203
  • 1
  • 9
  • 1
    https://stackoverflow.com/questions/227037/can-i-get-better-performance-using-a-join-or-using-exists#:~:text=The%20first%20method%20generally%20performs%20faster%20than%20the%20NOT%20EXISTS%20method%20though.&text=In%20cases%20like%20above%20the,the%20records%20must%20be%20used. – CodingYoshi Sep 20 '20 at 14:29
  • @CodingYoshi Thank you! Since the execution plans are so similar, and based on the information in your link, I'm thinking I can keep as is. I will keep my question up just in case someone has an answer to the direct question (ie: hinting at EF core to use an INNER JOIN). I really appreciate your help! – Anthony Iacono Sep 20 '20 at 14:36
  • 2
    EFC query translation cannot be "hinted". EFC responsibility is to create valid SQL returning the correct result described by the LINQ query. It's database query optimizers (CBO) responsibility of choosing the optimal execution plan. Controlling the execution plan the way you write SQL was used in the past when databases had RBOs rather than CBOs. Nowadays whether you use `join` or correlated `where` should not matter, same for `exists (...)` vs `in (select ...)` etc. – Ivan Stoev Sep 20 '20 at 14:47

2 Answers2

4

While the comment by @IvanStoev is correct and it should no be necessary for the query to be rewritten, in case you still care about different query generation, a simple way is to just reorder your table includes. Since you would not end up with songs as the root entities but favorite genres instead, you might want to rearrange the returned entities:

var result = context.FavoriteGenres
    .Include(favoriteGenre => favoriteGenre.Genre)
    .ThenInclude(genre => genre.Songs)
    .Where(favoriteGenre => favoriteGenre.UserId == userId)
    .AsEnumerable() // <-- switch to client-evaluation to return songs instead of favorite genres
    .SelectMany(favoriteGenre => favoriteGenre.Genre.Songs)
    .OrderBy(song => song.CreatedDateTimeOffset)
    .ToList();

This will result in the following SQL being generated:

SELECT [f].[UserId], [f].[GenreId], [g].[Id], [g].[Name], [s].[Id], [s].[CreatedDateTimeOffset], [s].[GenreId], [s].[Title]
FROM [FavoriteGenres] AS [f]
INNER JOIN [Genres] AS [g] ON [f].[GenreId] = [g].[Id]
LEFT JOIN [Songs] AS [s] ON [g].[Id] = [s].[GenreId]
WHERE [f].[UserId] = @__userId_0
ORDER BY [f].[UserId], [f].[GenreId], [g].[Id], [s].[Id]
lauxjpn
  • 4,749
  • 1
  • 20
  • 40
0

EXISTS is almost always more performant, however, SQL Server is smart enough to generate the same query plan for both query strings in most cases so you'd just be changing the text that you send to SQL Server but not really affecting the query plan that is generated once it gets there.

However if you really wanted to, you might be able to get it to generate the query you want with a .Include of the Favorites, and then if you don't want Favorite columns in the results .Select only the columns you want with something like:

dbContext
    .Songs
    .Include(song => song.Genre)
        .ThenInclude(genre => genre.FavoriteGenres)
    .Where(song => song.Genre.FavoriteGenres.Any(favoriteGenre => favoriteGenre.UserId == userId))
    .Select(song => new {
        song.Id,
        song.Title,
        song.Genre.Id,
        song.Genre.Name
    })
    .OrderBy(song => song.CreatedDateTimeOffset)
    .ToListAsync(cancellationToken);

OR

dbContext
    .Songs
    .Include(song => song.Genre)
        .ThenInclude(genre => genre.FavoriteGenres.Select(favoriteGenre => favoriteGenre.UserId == userID)
    .Select(song => new {
        song.Id,
        song.Title,
        song.Genre.Id,
        song.Genre.Name
    })
    .OrderBy(song => song.CreatedDateTimeOffset)
    .ToListAsync(cancellationToken);
Nick Fotopoulos
  • 531
  • 5
  • 15