0

In my scenario, I have a method called GetEntries that invokes another method called Get within a transaction scope. Get method contains three LINQ queries. The first query retrieves a single collection, the second query also retrieves a single collection, and the third query performs a group join using both collections.

Pseudo code

internal static class AuditTest
{
    public static IEnumerable<Audit> Get(
        DataContext dataContext,
        Int32 id,
        Int32[] recordIds)
    {
        var collection1 =
            dataContext
            .Table1
            .Where(d => d.MemberId == id && recordIds.Contains(d.RecordId))
            .Select(
                record =>
                    new
                    {
                        ......
                    })
            .ToArray();

        var collection2 =
            dataContext
            .Table2
            .Where(r => r.MemberId == id && recordIds.Contains(r.RecordId))
            .Select(
                record =>
                    new
                    {
                        ......
                    })
            .ToArray();

        return
            collection1
            .GroupJoin(
                collection2,
                record1 => record1.RecordId,
                record2 => record2.RecordId,
                (record1, record1WithRecord2) =>
                    new Audit
                    {
                        .......
                    });
    }

    public static IEnumerable<Entry> GetEntries(
        DataContext dataContext,
        Int32 Id,
        Int32[] Ids)
    {
        var option = new TransactionOptions()
        {
            IsolationLevel = IsolationLevel.ReadCommitted,
            Timeout = TransactionManager.MaximumTimeout
        };

        using (var transactionScope = new TransactionScope(
            TransactionScopeOption.Suppress, option))
        {
            return
                Get(dataContext, Id, Ids)
                .Select(
                    m =>
                        new Entry
                        {
                            Guid = m.Guid,
                            Id = m.Id,
                        });
        }
    }
}

The Get method is called inside a TransactionScope

    public static IEnumerable<Entry> GetEntries(
        DataContext dataContext,
        Int32 Id,
        Int32[] Ids)
    {
        var option = new TransactionOptions()
        {
            IsolationLevel = IsolationLevel.ReadCommitted,
            Timeout = TransactionManager.MaximumTimeout
        };

        using (var transactionScope = new TransactionScope(
            TransactionScopeOption.Suppress, option))
        {
            return
                Get(dataContext, Id, Ids)
                .Select(
                    m =>
                        new Entry
                        {
                            Guid = m.Guid,
                            Id = m.Id,
                        });
        }
    }
}

Upon investigation, it appears that the first query executed with the expected isolation level, which is "ReadCommitted." However, the second query unexpectedly executed with the "Serializable (4)" isolation level. This change in isolation level caused a timeout exception, as reported in the blocked-process-report.

To determine why the isolation level changed, I'm unable to pinpoint the exact cause at the moment. However, this issue has occurred sporadically in our production server.

Why it happens and how to reproduce it in a local machine? How to fix it?

Exception:

SqlException "Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding."

at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
   at System.Data.SqlClient.SqlDataReader.get_MetaData()
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
   at System.Data.Linq.SqlClient.SqlProvider.Execute(Expression query, QueryInfo queryInfo, IObjectReaderFactory factory, Object[] parentArgs, Object[] userArgs, ICompiledSubQuery[] subQueries, Object lastResult)
   at System.Data.Linq.SqlClient.SqlProvider.ExecuteAll(Expression query, QueryInfo[] queryInfos, IObjectReaderFactory factory, Object[] userArguments, ICompiledSubQuery[] subQueries)
   at System.Data.Linq.SqlClient.SqlProvider.System.Data.Linq.Provider.IProvider.Execute(Expression query)
   at System.Data.Linq.DataQuery`1.System.Collections.Generic.IEnumerable<T>.GetEnumerator()
   at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
   at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
   .......

Blocked process report:

<blocked-process-report>
<blocked-process>
    <process taskpriority="0" logused="0" waitresource="KEY: 11:72057594321895424 (67f1cd66ae3e)" waittime="7078" ownerId="26395737" transactionname="SELECT" lasttranstarted="2023-02-02T09:40:11.237" lockMode="RangeS-S" schedulerid="1" kpid="6632" status="suspended" spid="77" sbid="0" ecid="0" priority="0" trancount="0" lastbatchstarted="2023-02-02T09:40:11.237" lastbatchcompleted="2023-02-02T09:40:11.240" lastattention="1900-01-01T00:00:00.240" clientapp=".Net SqlClient Data Provider" isolationlevel="serializable (4)" xactid="26395737" currentdb="11" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
        <executionStack>
            <frame line="1" stmtstart="50" stmtend="2520"/>
            <frame line="1"/>
        </executionStack>
        <inputbuf>{{Query 1}}</inputbuf>
    </process>
</blocked-process>
<blocking-process>
    <process status="sleeping" spid="74" sbid="0" ecid="0" priority="0" trancount="1" lastbatchstarted="2023-02-02T09:40:11.210" lastbatchcompleted="2023-02-02T09:40:11.210" lastattention="1900-01-01T00:00:00.210" clientapp=".Net SqlClient Data Provider" isolationlevel="read committed (2)" xactid="26390106" currentdb="11" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
        <executionStack/>
        <inputbuf>{{Query 2}}</inputbuf>
    </process>
</blocking-process>
Ram Nivas
  • 142
  • 3
  • 13
  • Can you post the code for the transaction including where you set the IsolationLevel? – Peter Dongan Jun 19 '23 at 15:26
  • 3
    When you do `.ToArray();` you are fetching the entire collection to the client side, and then I believe your `GroupJoin()` is also going to be done on the client side. Why do that? See e.g. [LINQ query has bad performance while putting .ToArray() inside from/in](https://stackoverflow.com/q/65670660). – dbc Jun 19 '23 at 15:38
  • Agreed there is no readon to use `ToArray` here – Charlieface Jun 19 '23 at 15:45
  • Is there any place in your code where the transaction isolation level is, in fact, set to serializable? One notable source of trouble is that, unlike many other settings, the transaction isolation level is *not* reset to the default for pooled connections when they're returned to the pool. This means that if you use a non-standard isolation level *anywhere*, you'd better be explicit about it *everywhere*. – Jeroen Mostert Jun 19 '23 at 17:41
  • @peter, I only set the isolation level inside GetEntries method. – Ram Nivas Jun 19 '23 at 18:27
  • @Jeroen, while executing the second ToArray, it automatically changed to serializable. – Ram Nivas Jun 19 '23 at 18:29
  • @dbc, So can I remove the ToArray to avoid fetching in client side? I will try that. Is there a way to reproduce the issue in local? – Ram Nivas Jun 19 '23 at 18:32
  • 1
    @RamNivas - I don't currently have an environment to test it (which is why I added a comment instead of an answer), but I believe you can remove the `ToArray()` calls. Give it a try and see if the performance improves. Note sure about reproducing the issue in local. – dbc Jun 19 '23 at 18:34
  • 1
    Reading further, in your `Where` clause `d => d.Id == Id && Ids.Contains(d.Id)`, the expression `Ids.Contains(d.Id)` is effectively **constant**. Since we know `d.Id == Id` the value is the same as `Ids.Contains(Id)`. Why not just evaluate that outside the database query, and modify your query accordingly? And similarly for `Ids.Contains(r.Id)`. Furthermore, if `Ids.Contains(Id)` is `false`, isn't your query guaranteed to return nothing? Do you really need `Ids` at all? Or did you mean `d.Id == Id || Ids.Contains(d.Id)`? – dbc Jun 19 '23 at 19:34
  • I'm not aware of anything in EF that will automatically switch isolation levels. This is normally done explicitly with a Transaction. It will be hard to narrow down a possible cause without seeing the actual query used in the second scenario that appears to run in Serializable isolation. – Steve Py Jun 19 '23 at 21:07
  • @RamNivas post the *actual* code, not pseudocode. EF and LINQ don't change the isolation level. The code you posted has several problems already, like using Join and GroupJoin instead of letting EF Core use the relations between entities. Calling entities `Table1` instead of `Customers` is a very strong indication that EF is used not as an ORM but as a replacement for SQL – Panagiotis Kanavos Jun 22 '23 at 07:56
  • As for `TransactionScope` ... why? You don't need transactions when *reading* data. `SaveChanges` doesn't need it either, it already uses an internal transaction to save *all* pending changes. That's another strong indication that EF is misused. The blocks are caused by that misuse – Panagiotis Kanavos Jun 22 '23 at 07:58
  • First of all, remove `TransactionScope`. *That* is what starts a long-running transaction, keeping locks for longer than necessary and thus blocking other connections. Which in turn keep *their* locks for longer, delaying the original. The manual `GroupJoin` at the end keeps the database locks even longer. Second, check *which* query times out. It could be that the real query is inefficient or tries to load too much data. EF is an ORM, not an ETL tool. It's meant to load a graph of objects, not a stream of rows from a table. – Panagiotis Kanavos Jun 22 '23 at 08:02
  • What are the table schemas (columns and indexes?). How much data is there? Is anyone else using `Table1` or `Table2` at the same time? Do other parts of the code use explicit transactions? The way the code is written, any locks taken on these tables are kept until `Get` ends. Anything trying to update Table1 will block until then. EF Core normally uses optimistic concurrency and avoids such problems – Panagiotis Kanavos Jun 22 '23 at 08:05
  • @Panagiotis, I am unable to share the original code due to a policy restriction. Moreover, I have used a sample table name instead of the actual name, and the column name "Id" does not represent the exact property. In the modified pseudo code, I mistakenly assigned the same "Id" name to all the properties. I apologize for the confusion caused. – Ram Nivas Jun 22 '23 at 10:15
  • Upon reviewing the blocked-process-report, it appears that the first query (collection1) is identified as the blocking process with an isolation level of read committed. Meanwhile, the second query (collection2) is recognized as the blocked process and utilizes the isolation level of serializable. – Ram Nivas Jun 22 '23 at 10:20
  • Whatever the code is, is no different than dozens if not 100s of similar SO questions. In all cases, it's the bad code that causes blocking. Without the real code we can't tell you which part of the code explicitly uses Serializable - another `TransactionScope` most likely. We *can* tell you that this code *guarantees* blocks, timeouts and deadlocks. The locks taken on `Table1` are maintained while the query on Table2 runs *and* while joining in memory without any indexes. – Panagiotis Kanavos Jun 22 '23 at 10:28
  • @RamNivas - Just speculating here, but if the first query is blocking the second query, it might be because the `ToArray()` is sending the results over the wire which prevents the second query from running. Did you try removing the `ToArray()` calls and doing everything on the server side? (Note this question is independent of Panagiotis Kanavos's observations, which look to be on point.) – dbc Jun 22 '23 at 17:33
  • @dbc - When I removed the ToArray() function, a nested subquery was created. Upon examining the blocked process report, I discovered that the blocking query (the first one) was using the read committed isolation level, while the blocked query was using the serializable isolation level with different SPID, despite my explicit declaration of the read committed isolation level. – Ram Nivas Jun 23 '23 at 08:21
  • In my local environment, both queries used the same SPID and isolation level. However, on the production server, they differed in terms of SPID and isolation level. I'm wondering how the isolation level got changed, considering that EF doesn't switch it automatically. – Ram Nivas Jun 23 '23 at 08:21
  • Here i have edited to include the blocked-process-report for your reference. – Ram Nivas Jun 23 '23 at 09:50

1 Answers1

-2

You can increase the SQL command timeout in LINQ

dataContext.CommandTimeout = 3 * 60; // 3 Mins
Son of Man
  • 1,213
  • 2
  • 7
  • 27