2

I'm trying to write a LINQ/Entity Framework projection that will pull back two fields from a distant relation. For all the purposes of this question, I have these structure:

PlayedGame

  • int Id (PK)
  • List<PlayerGameResult> PlayerGameResults

Player

  • int Id (PK)
  • string Name

PlayerGameResult

  • PlayedGame PlayedGame (FK to PlayedGame)
  • Player Player (FK to Player)

For a given played game, I want to fetch the Name and Player Id of one arbitrary Player that was in that played game. In reality my example is a little bit more complicated, but I'm leaving out the details since they are a distraction.

The following projection is the best I could come up with:

 var result = dataContext.GetQueryable<PlayedGame>()
    .Where(playedGame => playedGame.Id == somePlayedGameId)
    .Select(x => new
        {
            Name = x.PlayerGameResults.FirstOrDefault() != null ? x.PlayerGameResults.FirstOrDefault().Player.Name : null,
            Id = x.PlayerGameResults.FirstOrDefault() != null ? x.PlayerGameResults.FirstOrDefault().Player.Id : 0
    })
    .FirstOrDefault();

... but the underlying SQL that is generated is moderately terrifying to me:

SELECT
    [Limit5].[Id] AS [Id], 
    [Limit5].[C1] AS [C1], 
    [Limit5].[C2] AS [C2]
FROM 
    (SELECT TOP (1) 
         [Project11].[Id] AS [Id], 
         CASE WHEN ([Project11].[C1] IS NOT NULL) THEN [Project11].[Name] END AS [C1], 
         CASE WHEN ([Project11].[C2] IS NOT NULL) THEN [Project11].[C3] ELSE 0 END AS [C2]
     FROM 
         (SELECT 
        [Project9].[Id] AS [Id], 
        [Project9].[Name] AS [Name], 
        [Project9].[C1] AS [C1], 
        [Project9].[C2] AS [C2], 
        (SELECT TOP (1) 
            [Extent6].[PlayerId] AS [PlayerId]
            FROM [dbo].[PlayerGameResult] AS [Extent6]
            WHERE [Project9].[Id] = [Extent6].[PlayedGameId]) AS [C3]
        FROM ( SELECT 
            [Project8].[Id] AS [Id], 
            [Project8].[Name] AS [Name], 
            [Project8].[C1] AS [C1], 
            [Project8].[C2] AS [C2]
            FROM ( SELECT 
                [Project6].[Id] AS [Id], 
                [Project6].[Name] AS [Name], 
                [Project6].[C1] AS [C1], 
                (SELECT TOP (1) 
                    [Extent5].[Id] AS [Id]
                    FROM [dbo].[PlayerGameResult] AS [Extent5]
                    WHERE [Project6].[Id] = [Extent5].[PlayedGameId]) AS [C2]
                FROM ( SELECT 
                    [Project5].[Id] AS [Id], 
                    [Extent4].[Name] AS [Name], 
                    [Project5].[C1] AS [C1]
                    FROM   (SELECT 
                        [Project3].[Id] AS [Id], 
                        [Project3].[C1] AS [C1], 
                        (SELECT TOP (1) 
                            [Extent3].[PlayerId] AS [PlayerId]
                            FROM [dbo].[PlayerGameResult] AS [Extent3]
                            WHERE [Project3].[Id] = [Extent3].[PlayedGameId]) AS [C2]
                        FROM ( SELECT 
                            [Project2].[Id] AS [Id], 
                            [Project2].[C1] AS [C1]
                            FROM ( SELECT 
                                [Extent1].[Id] AS [Id], 
                                (SELECT TOP (1) 
                                    [Extent2].[Id] AS [Id]
                                    FROM [dbo].[PlayerGameResult] AS [Extent2]
                                    WHERE [Extent1].[Id] = [Extent2].[PlayedGameId]) AS [C1]
                                FROM [dbo].[PlayedGame] AS [Extent1]
                                WHERE [Extent1].[Id] = @p__linq__0
                            )  AS [Project2]
                        )  AS [Project3] ) AS [Project5]
                    LEFT OUTER JOIN [dbo].[Player] AS [Extent4] ON [Project5].[C2] = [Extent4].[Id]
                )  AS [Project6]
            )  AS [Project8]
        )  AS [Project9]
    )  AS [Project11]
)  AS [Limit5]

Is there a "better" way to write this projection? In other words, how can I write this query in LINQ / Entity Framework to only pull back these two fields from the Player table in a fashion that will produce a decent/sane query? I won't pretend to know for certain that the above query is terrible -- but it just doesn't seem in the ballpark to me.

Looking forward to your thoughts!

jakejgordon
  • 4,008
  • 7
  • 36
  • 45

2 Answers2

1

The entire LINQ statement is translated into SQL, which means that you don't have to worry about null checks. SQL doesn't have a concept of null reference (we might say, it had null propagation right from the start!). So your statement can be simplified to:

var result = dataContext.GetQueryable<PlayedGame>()
    .Where(player => player.Id == testPlayerWithNoPlayedGames.Id)
    .Select(x => new
    {
        Name = x.PlayerGameResults.FirstOrDefault().Player.Name,
        Id = (int?)x.PlayerGameResults.FirstOrDefault().Player.Id
    }).FirstOrDefault();

Query syntax can make it look less complex:

var result = (from playerGame in dataContext.GetQueryable<PlayedGame>()
              where playerGame.Id == testPlayerWithNoPlayedGames.Id
              let player = x.PlayerGameResults.FirstOrDefault().Player
              select new
              {
                  Name = player.Name,
                  Id = (int?)player.Id
              }).FirstOrDefault();
Gert Arnold
  • 105,341
  • 31
  • 202
  • 291
  • At first I didn't believe you. My unit tests are blowing up... however, after digging into this quite a bit, you are in fact right. When running this query using EntityFramework directly against the DB this generates no errors! However, when setting up the data in a List and then querying the IQueryable (via .AsQueryable()), this actually blows up. So pure LINQ and EntityFramework behave differently in this regard! Here is a gist to demonstrate the difference: https://gist.github.com/jakejgordon/668ded9d18e7aaf513dfdf9d825c2d14 – jakejgordon Mar 20 '17 at 12:26
  • 1
    I've given up mocking/faking EF [long time ago](http://stackoverflow.com/a/13352779/861716). Only integration tests show the truth. – Gert Arnold Mar 20 '17 at 12:51
1

If I understand you right, you have many-to-many relation between Player and PlayedGame via PlayerGameResult table.

What you want to achieve is to get any relation from PlayerGameResults table for specific player.

Why not doing just that? For this you have to query PlayerGameResults table filtering by its player like this:

var result = dataContext.GetQueryable<PlayerGameResults>()
    .Where(player => player.Player.Id == testPlayerWithNoPlayedGames.Id)
    .Select(x => new {
        Name = x.Player.Name,
        Id = x.Player.Id,
    })
    .FirstOrDefault();

UPDATE based on comments discussion

As we're working with database basically, not with C# (while still writing I would say "SQL with C# tokens"), we have to think on database terms.

That's why it is essential to understand that you are selecting specific row from a table, and basically based on what you described it is logic to select row from many-to-many relation table PlayerGameResult

Sample from the accepted answer is a good "bad" sample, so I will be referring to gist you shared based on it (slightly formatting):

.Select(x => new AchievementRelatedPlayedGameSummary
{
    //--only pull records where the Player had rank -42 (i.e. none of the PlayerGameResults)
    WinningPlayerName = x.PlayerGameResults.FirstOrDefault(y => y.GameRank == -42).Player.Name,
    WinningPlayerId = x.PlayerGameResults.FirstOrDefault(y => y.GameRank == -42).Player.Id
})

Here we twice referring to (as we might think) the same row x.PlayerGameResults.FirstOrDefault(y => y.GameRank == -42).Player, but reality is that separate SQL will be generated for each of that lines. Samples again from your gist.

This one for the Player.Name (results in [Project2].[Name]):

OUTER APPLY  (SELECT TOP (1) 
    [Extent2].[PlayerId] AS [PlayerId]
    FROM [dbo].[PlayerGameResult] AS [Extent2]
    WHERE ([Filter1].[Id] = [Extent2].[PlayedGameId]) AND (-42 = [Extent2].[GameRank]) ) AS [Limit1]
    LEFT OUTER JOIN [dbo].[Player] AS [Extent3] ON [Limit1].[PlayerId] = [Extent3].[Id]) AS [Project2]

And this one for Player.Id (results in [Limit2].[PlayerId] - see how EF optimizes that to not to join to table):

OUTER APPLY  (SELECT TOP (1) 
    [Extent4].[PlayerId] AS [PlayerId]
    FROM [dbo].[PlayerGameResult] AS [Extent4]
    WHERE ([Project2].[Id] = [Extent4].[PlayedGameId]) AND (-42 = [Extent4].[GameRank]) ) AS [Limit2]

And here comes devil - as we have un-ordered SQL, SELECT TOP(1) will return any record it like to. There's no guarantee at all that these two results will be based on the same row (thanks to EF - here we have results even from different tables!), so you might receive Id from one Player, and Name from another Player. Result depends on indexes existence and statistics accumulated, db load, SQL server version and its overall mood. You just cannot be sure - that's it.

Of course I don't know your particular data model, and maybe filtering by GameRank will always return single row no matter what - but that is something you should not base your assumptions on anyway, in general.

So I still suggest base you query on entity you select, something like this one

var result = dataContext.GetQueryable<PlayerGameResults>()
    .Where(x => x.PlayedGame.Id == playedGameId && x.GameRank == -42)
    .Select(x => new AchievementRelatedPlayedGameSummary
    {
        WinningPlayerName = x.Player.Name,
        WinningPlayerId = x.Player.Id
    }).FirstOrDefault();
Lanorkin
  • 7,310
  • 2
  • 42
  • 60
  • your idea is fair, and I think I led you down the wrong path with the variable names I had in my original example. I'm actually starting with a given PlayedGame.Id, not a Player.Id -- so I don't think I can start from PlayerGameResult since there are potentially many PlayedGameResult records per PlayedGame -- and I only want a single row returned for a given PlayedGame – jakejgordon Mar 20 '17 at 11:35
  • @jakejgordon that just means another condition in `Where` part to actually find this particular `PlayerGameResulst` record - what am I missing here? – Lanorkin Mar 20 '17 at 11:38
  • maybe you can clarify in your answer? – jakejgordon Mar 20 '17 at 16:47
  • Thanks @Lanorkin -- a very interesting point about the FirstOrDefault() possibly returning different a different Player's information in both spots. I'll take steps to address that. In regards to the final query -- I still need to pull back PlayedGame information and other stuff regardless of whether the PlayerGameResult exists. I thought this was unnecessary noise to include in the original question but I guess not. In other words, I should have said that we need to assume that we have to start with the PlayedGame table. Sorry about that and thanks for your effort! – jakejgordon Mar 28 '17 at 20:33