I am hunting a bug in our project where data is sometimes saved to the database even though an exception occurs and everything should be rolled back. I have asked about it here before, and found that if I query the database for @@trancount right after creating a new TransactionScope and get 0 as the result, I do not have a "valid" transaction. It is somehow aborted or rolled back by another thread.
The code I use to reproduce it is really simple. I call this method in a Parallel.For a few thousand times:
    void Handle()
    {
        try
        {
            using (var transaction = new TransactionScope(TransactionScopeOption.Required))
            {
                // Getting @@trancount right here lets us verify whether we have a valid transaction.
                WriteImportantBusinessDataToDatabase("This will sometimes be committed even if the WCF call below fails!");

                // Making a transactional WCF call
                var serviceClient = new Service1Client("WSHttpBinding_IService1");
                serviceClient.DoWork(); // We get an exception here

                WriteImportantBusinessDataToDatabase("We never reach this location");
                transaction.Complete();
            }
        }
        catch (Exception e)
        {
            Logger.Error(e, "Something failed. {message}", e.Message);
        }
    }
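For completeness, the driver loop looks roughly like the sketch below. The iteration count of 5000, the `Program` class and the `RunHandlers` helper are assumptions made for illustration, and `Handle` is replaced by a counting stub so the snippet compiles on its own; in the real repro it is the method shown above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Program
{
    private static int _calls;

    // Stand-in for the real Handle() (assumption: the real body opens the
    // TransactionScope, writes to the database and makes the WCF call).
    private static void Handle()
    {
        Interlocked.Increment(ref _calls);
    }

    // Hypothetical helper: runs Handle() the given number of times in parallel
    // and returns how many calls were made.
    public static int RunHandlers(int iterations)
    {
        _calls = 0;
        // Parallel.For spreads the iterations over thread-pool threads, so many
        // TransactionScope instances are alive at the same time.
        Parallel.For(0, iterations, i => Handle());
        return _calls;
    }

    public static void Main()
    {
        Console.WriteLine(RunHandlers(5000));
    }
}
```

The point of the parallel loop is only to get enough concurrent scopes that the rare bad case shows up; a sequential loop would not exercise the same threading behaviour.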
The WCF service is transactional, with TransactionFlowOption.Mandatory:
    [ServiceContract]
    public interface IService1
    {
        [OperationContract]
        [TransactionFlow(TransactionFlowOption.Mandatory)]
        void DoWork();
    }

    [ServiceBehavior]
    public class Service1 : IService1
    {
        [OperationBehavior(TransactionScopeRequired = true)]
        public void DoWork()
        {
            throw new Exception();
        }
    }
@@trancount is expected to be 1 right after creating a new TransactionScope. But sometimes @@trancount is 0, and in even rarer cases 2. The System.Transactions.Transaction.Current.TransactionInformation.Status property is always Active, no matter what @@trancount is. Inserts into the database execute just fine. The problem occurs when the local transaction is promoted to a distributed transaction by the WCF call. Then we get an exception saying "The transaction has aborted."
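The check I run inside the scope can be sketched roughly as follows. This is a minimal sketch, not the exact code from the repro: the `TransactionDiagnostics` class, its method names and the `connectionString` parameter are assumptions for illustration. It opens a connection inside the ambient transaction so that it enlists, reads @@TRANCOUNT, and compares it with the status reported by System.Transactions.

```csharp
using System;
using System.Data.SqlClient;
using System.Transactions;

public static class TransactionDiagnostics
{
    // Reads @@TRANCOUNT on a connection opened inside the ambient transaction.
    // Assumption: connectionString points at the same SQL Server database
    // that the business data is written to.
    public static int GetTranCount(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT @@TRANCOUNT", connection))
        {
            // Open() enlists the connection in Transaction.Current, if there is one.
            connection.Open();
            return Convert.ToInt32(command.ExecuteScalar());
        }
    }

    // Hypothetical helper: the state I expect right after creating the scope
    // is an Active ambient transaction and @@TRANCOUNT equal to 1.
    public static bool LooksConsistent(TransactionStatus? status, int tranCount)
    {
        return status == TransactionStatus.Active && tranCount == 1;
    }

    // Logs both views of the transaction so mismatches are easy to spot.
    public static void LogState(string connectionString)
    {
        Transaction ambient = Transaction.Current;
        TransactionStatus? status = ambient?.TransactionInformation.Status;
        int tranCount = GetTranCount(connectionString);

        Console.WriteLine(
            "status={0}, @@trancount={1}, consistent={2}",
            status?.ToString() ?? "(no ambient transaction)",
            tranCount,
            LooksConsistent(status, tranCount));
    }
}
```

Calling `TransactionDiagnostics.LogState(...)` as the first statement inside the `using` block is enough to see the cases where the status is Active but @@trancount is 0 or 2.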
The method name Handle correctly suggests that this really is an NServiceBus handler, handling hundreds of thousands of messages day and night. We see this inconsistency only when there are problems on the other side of the WCF call.
A runnable version of this setup can be found on my GitHub.
I have reproduced the problem with all .NET Framework versions from 4.5.1 to 4.8. All of them yield the same results. Is anyone able to explain what is going on here?