4

In my service I have a background thread that does a best effort saving of a stream of object of certain entity type. Code roughly is following:

        while (AllowRun)
        {
            try
            {
                using (DbContext context = GetNewDbContext())
                {
                    while (AllowRun && context.GetConnection().State == ConnectionState.Open)
                    {
                        TEntity entity = null;
                        try
                        {
                            while (pendingLogs.Count > 0)
                            {
                                lock (pendingLogs)
                                {
                                    entity = null;
                                    if (pendingLogs.Count > 0)
                                    {
                                        entity = pendingLogs[0];
                                        pendingLogs.RemoveAt(0);
                                    }
                                }

                                if (entity != null)
                                {
                                    context.Entities.Add(entity);
                                }
                            }
                            context.SaveChanges();
                        }
                        catch (Exception e)
                        {
                            // (1)
                            // Log exception and continue execution
                        }
                    }
                }
            }
            catch (Exception e)
            {
               // Log context initialization failure and continue execution
            }
        }

(this is mostly the actual code, I omitted few non-relevant parts that attempt to keep popped objects in memory until we are able to save stuff to DB again when exception is caught at (1) block)

So, essentially, there is an endless loop, trying to read items from some list and save them to Db. If we detect that connection to DB failed for some reason, it just attempts to reopen it and continue. The issue is that sometimes (I failed to figure out how to reproduce it so far), the code above when context.SaveChanges() is called starts to produce following exception (caught in (1) block):

System.Data.EntityException: An error occurred while starting a transaction on the provider connection. See the inner exception for details. ---> 
System.InvalidOperationException: The requested operation cannot be completed because the connection has been broken.

The error is logged, but when the execution returns to the context.GetConnection().State == ConnectionState.Open check, it evaluates to true. So we are in a state when context reports that its DB connection is open, but we can't run queries against that context. Restarting the service removes the issue (as well as messing with AllowRun variable in debugger to force recreation of context). So the question is since I can't trust context's connection state, how do I verify that I can run queries against DB?

Also, is there a clean way to figure out that connection is not in a "healthy" state? I mean, the EntityException by itself is not an indication that I should reset the connection, only if its InnerException is InvalidOperationException with some specific Message, then yes, it is time to reset it. But, now I guess there would be other situations when ConnectionState indicates that everything is fine, but I can't query DB. Can I catch those proactively, not waiting until it starts to bite me?

n0rd
  • 11,850
  • 5
  • 35
  • 56
  • Can you provide detailed stack trace? Seems that either something transaction messed up ongoing EF connection or experienced connection timeout issue (possibly from MSDTC if distributed transactions being apply). – Tetsuya Yamamoto Feb 28 '17 at 07:32
  • 1
    Most likely non-relevant parts you omited are really relevant. All in all there is no need to hold connection for the lifetime of this operation. The only way you use connection is SaveChanges call (by the way you call it repeatedly if your example even if pendingLogs is empty). Just create new context when pendingLogs is not empty (connection will be returned from pool most often) and dispose after SaveChanges (connection will be returned to the pool). Also check if your transactions escalate to distributed and why (maybe you call multiple databases in one transacton etc). – Evk Feb 28 '17 at 08:45
  • 1
    Why do you get a context with an open connection in the first place? – Gert Arnold Mar 01 '17 at 21:14
  • @TetsuyaYamamoto, I'll post the full stack trace once I find it in the logs, the issue does not happen too often (I've seen it twice, including the one case that caused me to write this). – n0rd Mar 01 '17 at 22:57
  • @GertArnold, that's an interesting question. For some reason, our context creation method(s) always do `ctx = new DbContext(...); ctx.GetConnection().Open()`. Why is it done this way, I can only guess: the code predates me, but my guess would be that we would be able to handle connection failures in one place. – n0rd Mar 01 '17 at 23:04
  • Is there a particular reason you are avoiding recreating the context? Connection pooling means that there is very little performance cost to recreation. – Brandon Kramer Mar 02 '17 at 14:37
  • @BrandonKramer, I don't want to recreate a context/connection on each exception, just seems inefficient. But probably I am worrying too much. I would love to be able to re-establish it on broken connection, but there is no catchable exception for this specific case (I have to check the message of the inner exception, which is ridiculous). – n0rd Mar 02 '17 at 23:22
  • Recreating the context has very little cost, since you are not actually recreating the connection each time. When you close a connection, it just goes back into the [Connection Pool](https://msdn.microsoft.com/en-us/library/8xx3tyca(v=vs.110).aspx), and then when you create another context, one of the existing connections will be used. – Brandon Kramer Mar 03 '17 at 12:10
  • Also, its worth noting that if you call `Open()` on the connection for the `DbContext`, then the context will _never_ call `Close()`. That is, if you call `Open()` then you must also call `Close()`. Which means there should be no reason to check ConnectionState, since it is only set on calls to `Open()` or `Close()`. As a result, you can probably solve your problem by simply dropping the while (AllowRun && context.GetConnection()...) and letting it recreate the context each time. – Brandon Kramer Mar 04 '17 at 12:56
  • is it possible that multiple threads are triggering this code and hence one of the threads is disposing db context which closes underlying connection of the dbcontext (when dbcontext does not own the connection) ? – Gurpreet Mar 05 '17 at 22:33

3 Answers3

1

What is the log frequency?

if this loop take longer than connection timeout, connection closed when savechanges executing.

while (pendingLogs.Count > 0)
{
     lock (pendingLogs)
     {
          entity = null;
          if (pendingLogs.Count > 0)
          {
             entity = pendingLogs[0];
             pendingLogs.RemoveAt(0);
          }
     }

     if (entity != null)
     {
          context.Entities.Add(entity);
     }
}
context.SaveChanges();
levent
  • 3,464
  • 1
  • 12
  • 22
  • "Connection timeout" is the time limit for *establishing* connection. Once it is established, there are no timeouts for keeping it open. Am I wrong? At least I had never seen established SQL connections timing out (outside of query timeouts of course). – n0rd Mar 01 '17 at 23:57
0

From my experience working on similar services, garbage collection won't occur until the end of the using block.

If there were a lot of Pending logs to write, this could use a lot of memory, but I also guess it might starve the dbConnection pool.

You can analyse memory usage using RedGate ANTS or a similar tool, and check dbConnections that are open using the following script from this StackOverflow question: how to see active SQL Server connections?

SELECT 
    DB_NAME(dbid) as DBName, 
    COUNT(dbid) as NumberOfConnections,
    loginame as LoginName
FROM
    sys.sysprocesses
WHERE 
    dbid > 0
GROUP BY 
    dbid, loginame

;

I think it's good practice to free up the context as often as you can in order to give GC a change of cleaning up, so you could rewrite the loop as:

while (AllowRun)
    {
        try
        {
            while (pendingLogs.Count > 0)
            {
                using (DbContext context = GetNewDbContext())
                {
                    while (AllowRun && context.GetConnection().State == ConnectionState.Open)
                    {
                        TEntity entity = null;
                        try
                        {

                            lock (pendingLogs)
                            {
                                entity = null;
                                if (pendingLogs.Count > 0)
                                {
                                    entity = pendingLogs[0];
                                    pendingLogs.RemoveAt(0);
                                }
                            }

                            if (entity != null)
                            {
                                context.Entities.Add(entity);
                                context.SaveChanges();
                            }
                        }                        
                        catch (Exception e)
                        {
                            // (1)
                            // Log exception and continue execution
                        }
                    }
                }
            }
        }
        catch (Exception e)
        {
           // Log context initialization failure and continue execution
        }
    }
Community
  • 1
  • 1
0

I recommend go through below url: Timeout Expired is usually thrown when a sql query takes too long to run.

Sounds like a SQL job is running, backup? That might be locking tables or restarting the service.

ADONET async execution - connection broken error

Community
  • 1
  • 1
Sudhakar singh
  • 104
  • 2
  • 9