I am running a Windows Service that makes calls to several WCF services running elsewhere. Sometimes our clients encounter an uncatchable exception when calling an endpoint, and it brings down our service. The server side of the endpoint is, aside from the database calls it makes, completely stateless.
The exception is System.AccessViolationException, which I understand falls under the category of corrupted state exceptions, as I have read here.
I am aware of the [HandleProcessCorruptedStateExceptions] attribute, and of all the warnings against using it to catch exceptions thrown by code that the catcher does not also maintain.
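For reference, a minimal sketch of how that attribute is typically applied on .NET Framework 4.x. The names CorruptedStateDemo and RunGuarded are hypothetical and are not part of my code base; this only illustrates the mechanism and is not something I am proposing as a fix.

```csharp
using System;
using System.Runtime.ExceptionServices;

public static class CorruptedStateDemo
{
    // Hypothetical helper: runs an action and reports whether it completed.
    // On .NET Framework 4.x, the attribute allows the catch block in this
    // method to receive corrupted state exceptions such as
    // AccessViolationException, which are otherwise not delivered to
    // managed catch blocks.
    [HandleProcessCorruptedStateExceptions]
    public static bool RunGuarded(Action action)
    {
        try
        {
            action();
            return true;
        }
        catch (AccessViolationException ex)
        {
            // Process memory may already be corrupted at this point, so this
            // only logs and reports failure; the caller should decide whether
            // to shut the process down.
            Console.Error.WriteLine("Corrupted state exception: " + ex.Message);
            return false;
        }
    }
}
```

Even with this in place, the process state after such an exception cannot be trusted, which is why the warnings recommend logging and terminating rather than continuing.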
The top of the call stack delves into the native Windows libraries that WCF wraps:
System.AccessViolationException Stack:
at System.Net.UnsafeNclNativeMethods+OSSOCK.recv(IntPtr, Byte*, Int32, System.Net.Sockets.SocketFlags)
at System.Net.Sockets.Socket.Receive(Byte[], Int32, Int32, System.Net.Sockets.SocketFlags, System.Net.Sockets.SocketError ByRef)
at System.Net.Sockets.Socket.Receive(Byte[], Int32, Int32, System.Net.Sockets.SocketFlags)
at System.Net.Sockets.NetworkStream.Read(Byte[], Int32, Int32)
at System.Net.PooledStream.Read(Byte[], Int32, Int32)
at System.Net.Connection.SyncRead(System.Net.HttpWebRequest, Boolean, Boolean)
at System.Net.Connection.PollAndRead(System.Net.HttpWebRequest, Boolean)
at System.Net.ConnectStream.WriteHeaders(Boolean)
at System.Net.HttpWebRequest.EndSubmitRequest()
at System.Net.HttpWebRequest.SetRequestSubmitDone(System.Net.ConnectStream)
at System.Net.Connection.CompleteStartRequest(Boolean, System.Net.HttpWebRequest, System.Net.TriState)
at System.Net.Connection.SubmitRequest(System.Net.HttpWebRequest, Boolean)
at System.Net.ServicePoint.SubmitRequest(System.Net.HttpWebRequest, System.String)
at System.Net.HttpWebRequest.SubmitRequest(System.Net.ServicePoint)
at System.Net.HttpWebRequest.GetRequestStream(System.Net.TransportContext ByRef)
at System.Net.HttpWebRequest.GetRequestStream()
at System.ServiceModel.Channels.HttpOutput+WebRequestHttpOutput.GetOutputStream()
at System.ServiceModel.Channels.HttpOutput.Send(System.TimeSpan)
at System.ServiceModel.Channels.HttpChannelFactory`1+HttpRequestChannel+HttpChannelRequest[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].SendRequest(System.ServiceModel.Channels.Message, System.TimeSpan)
at System.ServiceModel.Channels.RequestChannel.Request(System.ServiceModel.Channels.Message, System.TimeSpan)
at System.ServiceModel.Dispatcher.RequestChannelBinder.Request(System.ServiceModel.Channels.Message, System.TimeSpan)
at System.ServiceModel.Channels.ServiceChannel.Call(System.String, Boolean, System.ServiceModel.Dispatcher.ProxyOperationRuntime, System.Object[], System.Object[], System.TimeSpan)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(System.Runtime.Remoting.Messaging.IMethodCallMessage, System.ServiceModel.Dispatcher.ProxyOperationRuntime)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(System.Runtime.Remoting.Messaging.IMessage)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(System.Runtime.Remoting.Proxies.MessageData ByRef, Int32)
at WCFCommunicator.IComEndpoint.TestEndpoint()
at WCFCommunicator.SendingLayer+<>c.<VerifyCurrentEndpoints>b__38_0(WCFCommunicator.IComEndpoint)
at WCFCommunicator.HttpWrapper.Client`2[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].UseCom[[System.Boolean, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]](System.Func`2<System.__Canon,Boolean>, ErrorType ByRef, System.String)
at WCFCommunicator.SendingLayer.VerifyCurrentEndpoints()
at WCFCommunicator.LocalSubscriptionAnalyzer.ValidateEndpoints_Elapsed(System.Object, System.Timers.ElapsedEventArgs)
at System.Timers.Timer.MyTimerCallback(System.Object)
at System.Threading.TimerQueueTimer.CallCallbackInContext(System.Object)
at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.TimerQueueTimer.CallCallback()
at System.Threading.TimerQueueTimer.Fire()
at System.Threading.TimerQueue.FireNextTimers()
at System.Threading.TimerQueue.AppDomainTimerCallback()
The client code is always passed in as a lambda and executed inside a try/catch/finally, which looks like:
public TReturn UseCom<TReturn>(Func<Com, TReturn> code)
{
    Com channel = cFactory.CreateChannel();
    bool error = true;
    try
    {
        TReturn result = code(channel);
        ((IClientChannel)channel).Close();
        error = false;
        return result;
    }
    catch (EndpointNotFoundException e)
    {
        //log and handle
        return default(TReturn);
    }
    catch (FaultException e)
    {
        //log and handle
        return default(TReturn);
    }
    catch (CommunicationException e)
    {
        //log and handle
        return default(TReturn);
    }
    catch (Exception e)
    {
        //log and handle
        return default(TReturn);
    }
    finally
    {
        if (error)
        {
            ((IClientChannel)channel).Abort();
        }
    }
}
Which is used like:
var result = clientInstance.UseCom(endpoint => endpoint.TestEndpoint());
This issue has been EXTREMELY hard to reproduce and does not seem to follow any other pattern. Any help would be greatly appreciated.