
I have developed a WCF web service that is called from several SharePoint Online workflows. At certain points, around 4 users may start up to 10 workflows within a very short time frame, and a single workflow can make as many as 3 requests to the web service. Needless to say, at those points the WCF service becomes overloaded. When a SharePoint workflow makes an HTTP web service call and the service is unavailable, the workflow runs into an error and attempts to restart after a short period of time, which only makes things worse.

These are some of the exceptions logged today from the web service during approximately 40 minutes of "overloading":

Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.

The underlying connection was closed: An unexpected error occurred on a receive.

The underlying connection was closed: A connection that was expected to be kept alive was closed by the server.

I have looked into ways to keep the WCF web service from malfunctioning when several requests are made at once. Besides the obvious step of reducing the number of calls made to the web service (which is not always an option), I came across the terms WCF Concurrency Modes and Throttling Limits.

Given the scenario described above, could anyone point me in the right direction as to which Concurrency Mode and Throttling Limits would be most suitable? At present, my WCF service uses the default configuration.

Concurrency Modes can be:

Single or Multiple or Reentrant

Throttling Limit options are shown below:

<serviceThrottling maxConcurrentCalls="Integer"  
maxConcurrentInstances="Integer"  
maxConcurrentSessions="Integer" />  

I am still quite new to this area of programming and am finding it a tad complicated, so any help would be greatly appreciated!

Update: The SharePoint system is highly customised and covers a business process that is quite complicated. The web service methods are varied, and it would take a long time to explain what every method does, but I will mention some examples. The web service is used for operations that cannot be done (easily or at all) using out-of-the-box SharePoint Designer actions. For example: moving documents and copying metadata from one folder to another (in the same or different lists), syncing information between lists/libraries, calculating values based on the metadata of several documents within a given folder, scheduling data into an external database to be used by other components such as a console application running as a scheduled task, etc.

The web service calls take an average of 2 minutes to execute and return a value. The fastest methods take around 30 seconds, and the slowest around 4 minutes. Both the slow and fast methods are frequently utilised.

Jurgen Cuschieri
  • Can you provide more information on what kind of operations this webservice is doing, and how much time these operations take? – Tiago Sousa Sep 27 '17 at 16:48
  • @TiagoSousa most certainly. I have updated the question to provide more information as you requested. Thanks in advance! – Jurgen Cuschieri Sep 27 '17 at 21:12

1 Answer


Your problem could be caused by a number of things, and you need to gather more information in order for anyone to be helpful to you.

That said, the best I can do here is give you some pointers on how to gather that information:

  • Turn on WCF tracing and try to work out when the error occurs from the SharePoint side. Does it occur while the web service is processing the request, after it has finished, or does the service never receive the request in the first place?
  • If this tracing doesn't give you many answers, write code in your web service to trace specific messages showing what the service is doing and what it receives from and returns to SharePoint, or use your preferred logging library.
  • In specific cases, the Event Viewer might have some information on what is happening. Check for any messages logged at around the same time as the error occurs on the client.
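
As a rough illustration of the first pointer, WCF's built-in tracing can usually be switched on from web.config alone, without installing any extra tooling on the server. The fragment below is a minimal sketch, not a tested configuration for this service: the listener name and the log path `C:\logs\WcfTrace.svclog` are placeholders, and the application pool identity is assumed to have write access to that folder.

```xml
<configuration>
  <system.diagnostics>
    <sources>
      <!-- System.ServiceModel is the trace source that WCF itself writes to -->
      <source name="System.ServiceModel"
              switchValue="Warning, ActivityTracing"
              propagateActivity="true">
        <listeners>
          <add name="wcfTraceListener" />
        </listeners>
      </source>
    </sources>
    <sharedListeners>
      <!-- Placeholder path: point this at a folder the service account can write to -->
      <add name="wcfTraceListener"
           type="System.Diagnostics.XmlWriterTraceListener"
           initializeData="C:\logs\WcfTrace.svclog" />
    </sharedListeners>
    <!-- Flush after every write so entries are not lost if the process recycles -->
    <trace autoflush="true" />
  </system.diagnostics>
</configuration>
```

The resulting `.svclog` file is plain XML, so it can be copied off the live server and opened on a development machine. Tracing at a more verbose level than `Warning` produces large files quickly, so it is probably best left on only while reproducing the problem.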

Finally, relaxing your serviceThrottling settings might mitigate some of your issues, but won't solve them. If you have a lot of I/O operations in your web service (access to databases, the file system, or other web services), you might improve its performance by using asynchronous I/O with the Task Parallel Library (TPL). If you are returning a lot of data from your web service (such as a big object, an object with cyclic references, or a big file), this might also be the reason why the server is forcing the connections closed.
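
For reference, this is roughly where a relaxed `serviceThrottling` element would sit in web.config. It is only a sketch: the behavior name `ThrottledBehavior` and the three numbers are illustrative assumptions, not recommendations for this workload. As far as I recall, on .NET 4 and later the defaults already scale with the machine (16 concurrent calls and 100 sessions per processor), so it is worth measuring before raising anything.

```xml
<configuration>
  <system.serviceModel>
    <behaviors>
      <serviceBehaviors>
        <!-- Hypothetical name: the service element must reference it
             through its behaviorConfiguration attribute -->
        <behavior name="ThrottledBehavior">
          <!-- Illustrative values only; tune them against measured load -->
          <serviceThrottling maxConcurrentCalls="64"
                             maxConcurrentSessions="200"
                             maxConcurrentInstances="264" />
        </behavior>
      </serviceBehaviors>
    </behaviors>
  </system.serviceModel>
</configuration>
```

With calls that run for 30 seconds to 4 minutes, higher limits mainly let more long operations run side by side. If those operations compete for the same database or SharePoint resources, raising the limits may just move the bottleneck, which is why throttling alone is unlikely to be the fix.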

Hope this helps you in solving your issue.

Tiago Sousa
  • Hi Tiago, could the mentioned errors be a result of an overload on the web service causing requests to be queued for a long-time and SharePoint ends up losing connection with the web service because of a timeout? With regards to your answer, thanks a lot. I have tried to install the Microsoft Tool to enable WCF tracing, however, I did not manage. The "fixes" suggest removing the Microsoft Visual C++ Redistributable, which is a risk I won't take on a live server. – Jurgen Cuschieri Sep 28 '17 at 11:25
  • The Event Viewer did not show any specific errors, but gave me some insight into a client making unnecessary requests to the problematic web service. I will use this information to try to avoid such scenarios in the future. With regard to your suggestion of writing "tracing code", I will be looking into it. Hopefully it will provide me with useful logs – Jurgen Cuschieri Sep 28 '17 at 11:28
  • I'm not sure if WCF or IIS will drop connections or return a 503 Service Unavailable Error when these thresholds in serviceThrottling are violated, but you surely would see something on your Trace. If you are unable to install the Trace viewer, try to redirect your [Trace to a file of your choosing](https://msdn.microsoft.com/en-us/library/system.diagnostics.textwritertracelistener(v=vs.110).aspx), by simply changing your configuration. – Tiago Sousa Sep 28 '17 at 11:37
  • Even so, you need to narrow down on the issue by gathering more info. At this stage, I don't think anyone can help you with the little information you have. – Tiago Sousa Sep 28 '17 at 11:39
  • I will try to gather more information and update / ask a new question then – Jurgen Cuschieri Sep 28 '17 at 11:48
  • In the link you shared, "Trace to a file of your choosing", I understand that you first have to enable tracing and then configure it in the config, but I don't understand the example, which uses code to write into a text file. Isn't this supposed to be an automatic process that logs information into a log file when problems arise? Or is it the same thing as writing code in the service to trace (as in the link you shared in your initial answer)? If it is the latter, where should the Trace.Write() method be used? In the exception handlers? Or sporadically within the web service methods? – Jurgen Cuschieri Sep 28 '17 at 12:01
  • Trace is a common stream to which .NET writes trace messages. Using the `System.Diagnostics.Trace` type, you can also write custom messages to this common stream. To see both the .NET messages and your own, you need to attach listeners to this stream so that the messages get written somewhere (the last link I gave you explains this in detail, including what you need to change in your web.config file to make this happen) – Tiago Sousa Sep 28 '17 at 12:08
  • If this has helped you please vote the answer and mark it as the correct answer. This way, it can help others with the same issue. – Tiago Sousa Sep 28 '17 at 12:53
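
The listener setup discussed in the comments above can be sketched as a web.config fragment. This is a minimal example under stated assumptions: the listener name `fileListener` and the path `C:\logs\service-trace.log` are placeholders, and the service is assumed to be compiled with the `TRACE` constant defined (the Visual Studio default), since calls on `System.Diagnostics.Trace` are compiled out otherwise.

```xml
<configuration>
  <system.diagnostics>
    <!-- autoflush writes each message to disk immediately -->
    <trace autoflush="true" indentsize="4">
      <listeners>
        <!-- Placeholder path: the service account needs write access to it -->
        <add name="fileListener"
             type="System.Diagnostics.TextWriterTraceListener"
             initializeData="C:\logs\service-trace.log" />
      </listeners>
    </trace>
  </system.diagnostics>
</configuration>
```

With a listener like this in place, any message the service writes through `System.Diagnostics.Trace` (for example at the start and end of each operation and inside exception handlers) ends up in that file, with no further code needed to open or manage the file itself.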