I am hunting a bug in our project where data is sometimes saved to the database even though an exception occurs and everything should be rolled back. I have asked about this here before, and found that if I query the database for @@trancount right after creating a new TransactionScope and get 0 as the result, I do not have a "valid" transaction. It has somehow been aborted or rolled back by another thread.

The code I use to reproduce is really simple:

I call this method in a Parallel.For loop a few thousand times:

void Handle()
{
    try
    {
        using (var transaction = new TransactionScope(TransactionScopeOption.Required))
        {
            // Reading @@trancount right here lets us verify whether we have a valid transaction.

            WriteImportantBusinessDataToDatabase("This will sometimes be committed even if the WCF call below fails!");

            // Making a transactional WCF call
            var serviceClient = new Service1Client("WSHttpBinding_IService1");
            serviceClient.DoWork(); // We get an exception here

            WriteImportantBusinessDataToDatabase("We never reach this location");

            transaction.Complete();
        }
    }
    catch (Exception e)
    {
        Logger.Error(e, "Something failed. {message}", e.Message);
    }
}

The WCF service is transactional, with TransactionFlowOption.Mandatory.

[ServiceContract]
public interface IService1
{
    [OperationContract]
    [TransactionFlow(TransactionFlowOption.Mandatory)]
    void DoWork();
}

[ServiceBehavior]
public class Service1 : IService1
{
    [OperationBehavior(TransactionScopeRequired = true)]
    public void DoWork()
    {
        throw new Exception();
    }
}

@@trancount is expected to be 1 right after creating a new TransactionScope. But sometimes @@trancount is 0, and in even rarer cases 2. The System.Transactions.Transaction.Current.TransactionInformation.Status property is always Active, no matter what @@trancount is. Inserts into the database execute just fine. The problem occurs when the local transaction is promoted to a distributed transaction by the WCF call. Then we get an exception saying "The transaction has aborted."
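For reference, the @@trancount check described above can be sketched roughly as follows. This is only a minimal illustration, not the exact code from the project: the TranCountProbe class, the ReadTranCount method and the connectionString parameter are hypothetical names, and it assumes a SqlConnection with the default Enlist=true setting so that it enlists in the ambient transaction when opened.

```csharp
using System;
using System.Data.SqlClient;

static class TranCountProbe
{
    // Hypothetical helper: returns @@TRANCOUNT as seen by a connection
    // opened inside the current TransactionScope.
    // connectionString is a placeholder for the real connection string.
    public static int ReadTranCount(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            // With Enlist=true (the default), Open() enlists the connection
            // in System.Transactions.Transaction.Current, if there is one.
            connection.Open();

            using (var command = new SqlCommand("SELECT @@TRANCOUNT", connection))
            {
                // @@TRANCOUNT is an int; ExecuteScalar returns it boxed.
                return Convert.ToInt32(command.ExecuteScalar());
            }
        }
    }
}
```

Called as the first statement inside the using block in Handle(), a return value other than 1 would indicate one of the "invalid" transactions described above.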

As the method name Handle suggests, this really is an NServiceBus handler, handling hundreds of thousands of messages day and night. We only see this inconsistency when there are problems on the other side of the WCF call.

A runnable version of this setup can be found on my GitHub.

I have reproduced the problem with all .NET Framework versions from 4.5.1 to 4.8. All of them yield the same results. Is anyone able to explain what is going on here?

stalskal

1 Answer

This is caused by a bug somewhere in the connection sharing mechanism of .NET. The bug reproduces on all .NET Framework versions from 4.5.1 to 4.8, and also with the new System.Data.SqlClient package.

An issue has been added to the System.Data.SqlClient repository.

stalskal