3

We are using Entity Framework and we want to perform a simple LEFT JOIN.

In fact, this is the SQL that we would like to have in LINQ (Queryable):

SELECT 
     cbl.ID as ClaimBatchLine_ID
    ,cbl.PurchasePrice
    ,c.*
    ,ic.DueDate
    ,ic.Reference
FROM ClaimBatchLine cbl
INNER JOIN Claim c ON c.ID = cbl.CLaim_ID
LEFT JOIN InvoiceClaim ic ON ic.ID = c.ID
WHERE cbl.ClaimBatch_ID = @claimBatchId
ORDER BY cbl.ID
OFFSET (@recordsPerPage*@page) ROWS
FETCH NEXT @recordsPerPage ROWS ONLY

What we came up with is this:

from cbl in ClaimBatchLines where cbl.ClaimBatch_ID == 1
from c in Claims where c.ID == cbl.Claim_ID
from ic in InvoiceClaims.DefaultIfEmpty() where ic.ID == c.ID
select new {cbl, c, ic.Reference}

And that produces the following SQL.

SELECT [t0].[ID],
     [t0].[ClaimBatch_ID],
     [t0].[Claim_ID],
     [t0].[PurchasePrice],
     [t1].[ID] AS [ID2],
     [t1].[ClaimType_ID],
     [t1].[Debtor_ID],
     [t1].[CustomerContractRevision_ID],
     [t1].[Date],
     [t1].[CreatedOn],
     [t1].[GrossAmount],
     [t1].[OpenAmount],
     [t1].[IsProcessedByOpenAmountCalculator],
     [t1].[RowVersion],
     [t2].[Reference] AS [Reference]
FROM [ClaimBatchLine] AS [t0]
    CROSS JOIN [Claim] AS [t1]
                   LEFT OUTER JOIN [InvoiceClaim] AS [t2] ON 1 = 1
WHERE([t2].[ID] = [t1].[ID])
    AND ([t1].[ID] = [t0].[Claim_ID])
    AND ([t0].[ClaimBatch_ID] = @p0);

It produces the same result set. So that is great. However, as you can see the LEFT OUTER JOIN [InvoiceClaim] AS [t2] ON 1 = 1 is not what we want. I wish it would have translated it to LEFT JOIN InvoiceClaim ic ON ic.ID = c.ID instead.

Are we doing something wrong? Or is LINQ to SQL just sub optimal (with respect to performance) and not able to understand what we want.

EDIT: In LINQPad this results in some nice query

from cbl in ClaimBatchLines
join c in Claims on cbl.Claim_ID equals c.ID
join ic in InvoiceClaims on c.ID equals ic.ID into g
from e in g.DefaultIfEmpty()
where cbl.ClaimBatch_ID == 1
select new {cbl, c, e.Reference}

namely

-- Region Parameters
DECLARE @p0 INT= 1;
-- EndRegion
SELECT [t0].[ID],
     [Columns left out for brevity]
     [t2].[Reference] AS [Reference]
FROM [ClaimBatchLine] AS [t0]
    INNER JOIN [Claim] AS [t1] ON [t0].[Claim_ID] = [t1].[ID]
    LEFT OUTER JOIN [InvoiceClaim] AS [t2] ON [t1].[ID] = [t2].[ID]
WHERE [t0].[ClaimBatch_ID] = @p0;

but when adding the paging functionality like so:

(from cbl in ClaimBatchLines
join c in Claims on cbl.Claim_ID equals c.ID
join ic in InvoiceClaims on c.ID equals ic.ID into g
from e in g.DefaultIfEmpty()
where cbl.ClaimBatch_ID == 1
select new {cbl, c, e.Reference})
.OrderBy(a => a.cbl.ID)
.Skip(0 * 15000)
.Take(15000)

It produces this 'monster':

-- Region Parameters
DECLARE @p0 INT= 1;
DECLARE @p1 INT= 0;
DECLARE @p2 INT= 15000;
-- EndRegion
SELECT [t4].[ID],
     [Columsn left out for brevity...]
FROM
(
   SELECT ROW_NUMBER() OVER(ORDER BY [t3].[ID]) AS [ROW_NUMBER],
        [Columsn left out for brevity...]
   FROM
   (
      SELECT [t0].[ID],
            [Columsn left out for brevity...]
      FROM [ClaimBatchLine] AS [t0]
          INNER JOIN [Claim] AS [t1] ON [t0].[Claim_ID] = [t1].[ID]
          LEFT OUTER JOIN [InvoiceClaim] AS [t2] ON [t1].[ID] = [t2].[ID]
   ) AS [t3]
   WHERE [t3].[ClaimBatch_ID] = @p0
) AS [t4]
WHERE [t4].[ROW_NUMBER] BETWEEN @p1 + 1 AND @p1 + @p2
ORDER BY [t4].[ROW_NUMBER];

And even worse. When I execute that same LINT-To-EF not via LINQpad but in code using EF repositories I get this even bigger monster:

SELECT [Project5].[ID] AS [ID],
     [Columns left out for brevity...]
     [Project5].[Reference] AS [Reference]
FROM
(
   SELECT [Extent1].[ID] AS [ID],
        [Extent1].[Claim_ID] AS [Claim_ID],
        [Extent1].[ClaimBatch_ID] AS [ClaimBatch_ID],
        [Extent1].[PurchasePrice] AS [PurchasePrice],
        [Join4].[Id1] AS [ID1],
        [Join4].[ClaimType_ID] AS [ClaimType_ID],
        [Join4].[Debtor_ID] AS [Debtor_ID],
        [Join4].[CustomerContractRevision_ID] AS [CustomerContractRevision_ID],
        [Join4].[Date] AS [Date],
        [Join4].[GrossAmount] AS [GrossAmount],
        [Join4].[OpenAmount] AS [OpenAmount],
        [Join4].[CreatedOn] AS [CreatedOn],
        [Join4].[IsProcessedByOpenAmountCalculator] AS [IsProcessedByOpenAmountCalculator],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN '2X'
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN '2X0X'
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN '2X1X'
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN '2X2X'
            ELSE '2X3X'
        END AS [C1],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN CAST(NULL AS BIT)
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN [Join4].[IsAppeared]
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN CAST(NULL AS BIT)
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN CAST(NULL AS BIT)
        END AS [C2],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN CAST(NULL AS TINYINT)
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN CAST(NULL AS TINYINT)
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN [Join4].[AdjustmentClaimReason_ID]
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN CAST(NULL AS TINYINT)
        END AS [C3],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN [Join4].[User_ID]
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN CAST(NULL AS INT)
        END AS [C4],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN [Join4].[CostClaimAnnouncement_ID]
        END AS [C5],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN CAST(NULL AS DECIMAL(19, 4))
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN CAST(NULL AS DECIMAL(19, 4))
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN CAST(NULL AS DECIMAL(19, 4))
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN [Join4].[DiscountFactor]
        END AS [C6],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN CAST(NULL AS DATETIME2)
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN CAST(NULL AS DATETIME2)
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN CAST(NULL AS DATETIME2)
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN [Join4].[DiscountValidTo]
        END AS [C7],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN [Join4].[AppliedDiscountAdjustmentClaim_ID]
        END AS [C8],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN CAST(NULL AS INT)
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN [Join4].[ExpiredDiscountAdjustmentClaim_ID]
        END AS [C9],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN CAST(NULL AS VARCHAR(1))
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN CAST(NULL AS VARCHAR(1))
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN CAST(NULL AS VARCHAR(1))
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN CAST(NULL AS VARCHAR(1))
            ELSE [Join4].[Reference]
        END AS [C10],
        CASE
            WHEN((NOT(([Join4].[C11] = 1)
                    AND ([Join4].[C11] IS NOT NULL)))
                AND (NOT(([Join4].[C12] = 1)
                       AND ([Join4].[C12] IS NOT NULL)))
                AND (NOT(([Join4].[C13] = 1)
                       AND ([Join4].[C13] IS NOT NULL)))
                AND (NOT(([Join4].[C14] = 1)
                       AND ([Join4].[C14] IS NOT NULL))))
            THEN CAST(NULL AS DATETIME2)
            WHEN(([Join4].[C14] = 1)
                AND ([Join4].[C14] IS NOT NULL))
            THEN CAST(NULL AS DATETIME2)
            WHEN(([Join4].[C12] = 1)
                AND ([Join4].[C12] IS NOT NULL))
            THEN CAST(NULL AS DATETIME2)
            WHEN(([Join4].[C11] = 1)
                AND ([Join4].[C11] IS NOT NULL))
            THEN CAST(NULL AS DATETIME2)
            ELSE [Join4].[DueDate]
        END AS [C11],
        [Extent7].[Reference] AS [Reference]
   FROM [dbo].[ClaimBatchLine] AS [Extent1]
       INNER JOIN
   (
      SELECT [Extent2].[Id] AS [Id1],
            [Columns left out for brevity...]
      FROM [dbo].[Claim] AS [Extent2]
          LEFT OUTER JOIN
      (
         SELECT [Extent3].[Id] AS [Id],
               [Extent3].[CostClaimAnnouncement_ID] AS [CostClaimAnnouncement_ID],
               [Extent3].[DiscountFactor] AS [DiscountFactor],
               [Extent3].[DiscountValidTo] AS [DiscountValidTo],
               [Extent3].[AppliedDiscountAdjustmentClaim_ID] AS [AppliedDiscountAdjustmentClaim_ID],
               [Extent3].[ExpiredDiscountAdjustmentClaim_ID] AS [ExpiredDiscountAdjustmentClaim_ID],
               CAST(1 AS BIT) AS [C1]
         FROM [dbo].[CostClaim] AS [Extent3]
      ) AS [Project1] ON [Extent2].[Id] = [Project1].[Id]
          LEFT OUTER JOIN
      (
         SELECT [Extent4].[Id] AS [Id],
               [Extent4].[IsAppeared] AS [IsAppeared],
               CAST(1 AS BIT) AS [C1]
         FROM [dbo].[InterestClaim] AS [Extent4]
      ) AS [Project2] ON [Extent2].[Id] = [Project2].[Id]
          LEFT OUTER JOIN
      (
         SELECT [Extent5].[Id] AS [Id],
               [Extent5].[AdjustmentClaimReason_ID] AS [AdjustmentClaimReason_ID],
               [Extent5].[User_ID] AS [User_ID],
               CAST(1 AS BIT) AS [C1]
         FROM [dbo].[AdjustmentClaim] AS [Extent5]
      ) AS [Project3] ON [Extent2].[Id] = [Project3].[Id]
          LEFT OUTER JOIN
      (
         SELECT [Extent6].[Id] AS [Id],
               [Extent6].[Reference] AS [Reference],
               [Extent6].[DueDate] AS [DueDate],
               CAST(1 AS BIT) AS [C1]
         FROM [dbo].[InvoiceClaim] AS [Extent6]
      ) AS [Project4] ON [Extent2].[Id] = [Project4].[Id]
   ) AS [Join4] ON [Extent1].[Claim_ID] = [Join4].[Id1]
       LEFT OUTER JOIN [dbo].[InvoiceClaim] AS [Extent7] ON [Join4].[Id1] = [Extent7].[Id]
   WHERE 1 = [Extent1].[ClaimBatch_ID]
) AS [Project5]
ORDER BY [Project5].[ID] ASC
OFFSET 0 ROWS FETCH NEXT 15000 ROWS ONLY;

What the hell is going on here! Scratching on the LINQpad initially looked fine. But the final query in production code is just plain horrible! Maybe these queries are fine with some simple applications. This is not fine to me when querying 500k records among even more records. I will just stick with plain SQL table valued functions instead using LINQ. What a pity.

Mike de Klerk
  • 11,906
  • 8
  • 54
  • 76

3 Answers3

4

Try this way:

from cbl in ClaimBatchLines 
join c in Claims on c.ID equals cbl.Claim_ID
join ic in InvoiceClaims on ic.ID equals c.ID into g
from e in g.DefaultIfEmpty()
where cbl.ClaimBatch_ID == 1
select new {cbl, c, e.Reference}

For more info about how to perform a left join in linq take a look this link

ocuenca
  • 38,548
  • 11
  • 89
  • 102
  • Actually there is no need to check for `null` in L2E queries because SQL naturally handles that. The only thing you need to care is if the property is value type, in which case you need to promote (cast) it to the corresponding nullable type. – Ivan Stoev Feb 23 '17 at 20:21
  • 1
    Got it. Thanks for the info @IvanStoev, every day I learn something new ;) – ocuenca Feb 23 '17 at 20:26
  • 1
    We all do, and with EF Core should start again because it totally changes everything :) – Ivan Stoev Feb 23 '17 at 20:30
  • I had to swap cbl.Claim_ID equals c.ID because I got the message `CS1937 The name 'c' is not in scope on the left side of 'equals'. ` – Mike de Klerk Feb 24 '17 at 06:38
  • This answer resulted in the most concise and clear TSQL. An INNER JOIN on the Claim and a LEFT JOIN on the InvoiceClaim table. Thanks for the great answer. – Mike de Klerk Feb 24 '17 at 06:42
2

The easiest way is to move your where to the join:

from cbl in ClaimBatchLines where cbl.ClaimBatch_ID == 1
from c in Claims where c.ID == cbl.Claim_ID
from ic in InvoiceClaims.Where(x => x.ID == c.ID).DefaultIfEmpty() 
select new {cbl, c, ic.Reference}

This will make the query use an an left join on the InvoiceClaims table

Aducci
  • 26,101
  • 8
  • 63
  • 67
  • This produces a `CROSS JOIN` on Claim. Where ClaimBatchLine always has a 1:1 relation with Claim. From what I understand from this post: http://stackoverflow.com/questions/17759687/cross-join-vs-inner-join-in-sql-server-2008 is that using an INNER JOIN is preferred when there is an intentional relationship between to the two tables. But according to this post http://stackoverflow.com/a/6418253/1567665 the performance is the same. Not the accepted answer. but +1 for a correct answer. – Mike de Klerk Feb 24 '17 at 06:50
1

Try writing it using linq's join.

var q = 
(from cbl in ClaimBatchLines
join c in Claims on cbl.Claim_ID equals c.ID
join tmpIc in InvoiceClaims on c.ID equals tmpIc.ID into g
from ic in g.DefaultIfEmpty()
where cbl.ClaimBatch_ID == 1
select new { cbl, c, ic })
  .OrderBy(x => x.cbl.ID) 
  .Skip(recordsPerPage * page)
  .Take(recordsPerPage);
Magnus
  • 45,362
  • 8
  • 80
  • 118
  • This results in a LEFT JOIN with a subquery that selects a subset of columns from InvoiceClaim table. I am not sure whether it makes a huge performance difference with the accepted answer. But ClaimBatches can be large (500K records) and the simpler the SQL, the better I assume. Thanks for the help though. – Mike de Klerk Feb 24 '17 at 06:45
  • +1 that you where so complete for ordering, skipping and taking. I didn't knew that the query could be demarcated with () and used in combination with other LINQ methods. – Mike de Klerk Feb 24 '17 at 07:01