2

I am executing the following LINQ to Entities query but it is stuck and does not return response until timeout. I executed the same query on SQL Server and it return 92000 in 3 sec.

            var query = (from r in WinCtx.PartsRoutings
                     join s in WinCtx.Tab_Processes on r.ProcessName equals s.ProcessName
                     join p in WinCtx.Tab_Parts on r.CustPartNum equals p.CustPartNum
                     select new { r}).ToList();

SQL Generated:

SELECT [ I omitted columns]
    FROM   [dbo].[PartsRouting] AS [Extent1]
INNER JOIN [dbo].[Tab_Processes] AS [Extent2] ON ([Extent1].[ProcessName] = [Extent2].[ProcessName]) OR (([Extent1].[ProcessName] IS NULL) AND ([Extent2].[ProcessName] IS NULL))
INNER JOIN [dbo].[Tab_Parts] AS [Extent3] ON ([Extent1].[CustPartNum] = [Extent3].[CustPartNum]) OR (([Extent1].[CustPartNum] IS NULL) AND ([Extent3].[CustPartNum] IS NULL))

PartsRouting Table has 100,000+ records, Parts = 15000+, Processes = 200.

I tried too many things found online but nothing worked for me as to how I can achieve the result with same performance of SQL.

Khalil
  • 119
  • 1
  • 14
  • What are the datatypes of the columns in your database? – Wouter van Vegchel Mar 23 '17 at 10:33
  • String (nvarchar). actually I am moving old database data to new database where every table has primary-FK constraints. Data is being retrieved from PartsRouting table to be inserted into its alternate table where it has key constraints with Tab_Processes and Tab_Parts. – Khalil Mar 23 '17 at 10:44
  • What is the exact exception? Also can we see the generated SQL? – Ivan Stoev Mar 23 '17 at 10:46
  • I will have to wait long for it, but in google searched I copied it as: entity+framework+Timeout+expired.+The+timeout+period+elapsed+prior+to+completion+of+the+operation+or+the+server+is+not+responding – Khalil Mar 23 '17 at 10:49
  • Neither EF nor SQL Server have any such bugs, which is why such a search won't return anything useful. Query performance depends on the data, indexes and the generated execution plan. That's the common answer to all similar questions - in fact this question could be a duplicate of any other that asks "why is my query slow?". I'd even bet that you *don't* have indexes that cover `ProcessName`, resulting in full table scans – Panagiotis Kanavos Mar 23 '17 at 11:00
  • Why are you joining entities *at all* instead of defining relations and using navigation properties? `PartsRouting` should have a `Process` property and `Process` should have `Parts`. The relations should be defined using the primary keys, typically IDs of the entities. – Panagiotis Kanavos Mar 23 '17 at 11:01
  • Finally, why are you searching for all 100K rows? You can't display all of them on a grid or report. Are you trying to bulk export the data? Shouldn't that happen on the server itself? – Panagiotis Kanavos Mar 23 '17 at 11:03
  • If you read my question, I am copying old database which has no relational constraints, to new database where primary keys and foreign keys are setup. – Khalil Mar 23 '17 at 11:07
  • the problematic part of the SQL generated query is "OR" between the query. I wondered how it is translated from Linq query. – Khalil Mar 23 '17 at 11:13
  • But do you have indexes on `ProcessName` and `CustPartNum` columns in the old database? Could you confirm that you executed the above SQL against old database in SSMS and it worked? The `IS NULL` conditions seem redundant and may slow down the query execution. Actually what's the reason for joins - just to filter something that normally FK will enforce? – Ivan Stoev Mar 23 '17 at 11:14
  • Yes it worked in SSMS but without IS NULL.Yes is null is redundant why entity framework has generated this?. – Khalil Mar 23 '17 at 11:16
  • There is no indexes on any table of old database – Khalil Mar 23 '17 at 11:16
  • So you confirm that the problem is caused by additional `OR` with `IS NULL` conditions in joins? If true, there is solution, but not with joins, so you need to confirm that joins are used just for filtering, i.e. the `select` does not include fields from `s` and `p` in your example, is that correct? – Ivan Stoev Mar 23 '17 at 11:27
  • This link may be help full [Entity Framework query slow, but same SQL in SqlQuery is fast](http://stackoverflow.com/questions/15767803/entity-framework-query-slow-but-same-sql-in-sqlquery-is-fast) – Abinash Mar 23 '17 at 11:46
  • It does include those to get Primary Key ID for Part and Process, I just for test omitted these. 's' and 'p' are new database tables where i will get primary key to be insert new Tab_PartRoutings table – Khalil Mar 23 '17 at 11:48

1 Answers1

5

Based on the comments, looks like the issue is caused by the additional OR with IS NULL conditions in joins generated by the EF SQL translator. They were added in EF in order to emulate the C# == operator semantics which are different from SQL = for NULL values.

You can start by turning that EF behavior off through UseDatabaseNullSemantics property (it's false by default):

WinCtx.Configuration.UseDatabaseNullSemantics = true;

Unfortunately that's not enough, because it fixes the normal comparison operators, but they simply forgot to do the same for join conditions.

In case you are using joins just for filtering (as it seems), you can replace them with LINQ Any conditions which translates to SQL EXISTS and nowadays database query optimizers are treating it the same way as if it was an inner join:

var query = (from r in WinCtx.PartsRoutings
             where WinCtx.Tab_Processes.Any(s => r.ProcessName == s.ProcessName)
             where WinCtx.Tab_Parts.Any(p => r.CustPartNum == p.CustPartNum)
             select new { r }).ToList();

You might also consider using just select r since creating anonymous type with single property just introdeces additional memory overhead with no advantages.

Update: Looking at the latest comment, you do need fields from joined tables (that's why it's important to not omit relevant parts of the query in question). In such case, you could try the alternative join syntax with where clauses:

WinCtx.Configuration.UseDatabaseNullSemantics = true;
var query = (from r in WinCtx.PartsRoutings
             from s in WinCtx.Tab_Processes where r.ProcessName == s.ProcessName
             from p in WinCtx.Tab_Parts where r.CustPartNum == p.CustPartNum
             select new { r, s.Foo, p.Bar }).ToList();
Ivan Stoev
  • 195,425
  • 15
  • 312
  • 343
  • Great.. Worked `SELECT [Extent2].[ProcessID] AS [ProcessID], [Extent3].[PartID] AS [PartID] FROM [dbo].[PartsRouting] AS [Extent1] INNER JOIN [dbo].[Tab_Processes] AS [Extent2] ON [Extent1].[ProcessName] = [Extent2].[ProcessName] INNER JOIN [dbo].[Tab_Parts] AS [Extent3] ON [Extent1].[CustPartNum] = [Extent3].[CustPartNum]` – Khalil Mar 23 '17 at 12:28
  • The SaveChanges command is also terribly slow as oppose to SQL Insert INTO, Can you advise workaround for this? – Khalil Mar 23 '17 at 12:39
  • Search for 3rd party packages supporting EF batch insert. But `INSERT INTO ... SELECT ..` cannot be beaten, you might consider doing such data move operations directly in the database or using the corresponding database tools. – Ivan Stoev Mar 23 '17 at 13:03