22

Using Entity Framework, I received a number of the following exceptions last night in one of my applications:

System.Data.EntityException: The underlying provider failed on Commit. ---> 
System.Data.SqlClient.SqlException: Timeout expired.  The timeout period elapsed prior 
to completion of the operation or the server is not responding.     
    at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection) 
    at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning()     
    at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
    at System.Data.SqlClient.TdsParserStateObject.ReadSni(DbAsyncResult asyncResult, TdsParserStateObject stateObj)
    at System.Data.SqlClient.TdsParserStateObject.ReadNetworkPacket()     
    at System.Data.SqlClient.TdsParserStateObject.ReadBuffer()     
    at System.Data.SqlClient.TdsParserStateObject.ReadByte()     
    at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler,SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)     
    at System.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest)
    at System.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)     
    at System.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransaction(TransactionRequest transactionRequest, String name, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
    at System.Data.SqlClient.SqlInternalTransaction.Commit()     
    at System.Data.SqlClient.SqlTransaction.Commit()
    at System.Data.EntityClient.EntityTransaction.Commit()     
    --- End of inner exception stack trace ---     
    at System.Data.EntityClient.EntityTransaction.Commit()     
    at System.Data.Objects.ObjectContext.SaveChanges(SaveOptions options)     
    at System.Data.Entity.Internal.InternalContext.SaveChanges()     
    at System.Data.Entity.Internal.LazyInternalContext.SaveChanges()

What's interesting about this error is that the data was actually written to the database. I found a related post on a MS site that seemed to indicate that this was a network related error.

A few questions that I could use assistance on are:

  1. What options do I have to trouble shoot this error?
  2. Is it more than likely network related or could the DB be a suspect?
  3. How can I tell from the code whether the transaction really did complete?
  4. Should I query the DB on this error to check if success or simply retry the transaction?
  5. If I do retry the transaction, how can this be accomplished automatically with Entity Framework or do I simply catch/retry?
  6. What other items should I be looking at?

Thanks in advance.

UPDATE

Using Ignite for SQL we were able to determine that a secondary SQL process from another group was monopolizing the CPU preventing our application from functioning properly. In short, we're moving forward with adding a secondary server to prevent further conflicts between the two teams.

What's still interesting about the exception is that the transaction actually succeeded rather than failed.

Randy
  • 1,955
  • 2
  • 15
  • 24
  • I know it's been a long time, but can you elaborate on how you used Ignite for SQL? In my case it does not always happen so it's hard to know when to watch the server. – gitsitgo Aug 30 '13 at 19:36
  • Can't say that I'm an ignite guru; we have a DB performance team who handles this stuff. I can say that what I typically am looking for when I get performance issues are top sql statements (as ignite states "the big bars"). Those are typically the queries hogging everything. From there it's a matter of tuning those queries. Sorry that I couldn't provide more specifics that you are looking for. Good luck. – Randy Aug 31 '13 at 15:41

2 Answers2

2

My bet is that the success response from the transaction commit command was not sent (or not sent fast enough) causing an exception in your code. A kinda crazy edge case. Exceptions of this kind dont necessarily mean that the actual execution of the command failed just that there was A failure.

In the same way if there was a problem sending the response from a webservice call it wouldn't necessarily imply that any side effects of that call were not applied.

undefined
  • 33,537
  • 22
  • 129
  • 198
  • Luke, Thanks for your comment. You definitely have the right idea. I've seen the same thing in WS code where the WS processed the request but the response timed out. Drives me nuts when this occurs as you have no way to "know" if things actually worked; that is unless you want to code up services that can be called to verify that things actually did work. ;-) – Randy Jun 04 '12 at 14:54
  • 1
    @Randy its actually a very similar thing to in computer networks with TCP. The principal is that if we dont get confirmation that a message is delivered we retry it until we do, however this means that the message (in this case the database transaction) has to be idempotent (ie replayable) which isn't the case with database transactions – undefined Jun 04 '12 at 20:41
1

+1 for Luke, the explanation is good. The wording of error is unfortunate.

System.Data.EntityException: **The underlying provider failed on Commit.** ---> 
System.Data.SqlClient.SqlException: Timeout expired.  The timeout period elapsed prior 

should be read

System.Data.EntityException: **The underlying provider failed to respond to Commit**
System.Data.SqlClient.SqlException: Timeout expired.  The timeout period elapsed prior 

the likely causes are Network OR Server issues. e.g. 100 CPU, or other server delay, are all still correct. BUT you do NOT KNOW if it has committed or not, if this is a timeout case. If a response was received with failed, The DB should have rolled back. Of course if that didn't happen, then the DB has hummm crashed and resulted in potential corruption. Rare i hope.

I have seen in a 1 billion+ row table... During space allocation under growth, as index and data area needs expanding, take more than 30 secs. BUT the COMMIT DID happen. Client had timed out. Online re-orgs can cause such delays as well (well I've seen that on DB2 at least)

JP Chapleau
  • 103
  • 1
  • 5
phil soady
  • 11,043
  • 5
  • 50
  • 95