Design question about background processing web service

Question

The title of the question may not be clear enough, so allow me to explain the background here:

I would like to design a web service that generates a PDF and submits it to a printer. Here is the workflow:

The user submits a request to the web service. The request will probably be one-way, so that the user doesn't have to wait for the job to complete. The user may receive an HTTP 200 and continue their work.
Once the web service receives the request, it generates the PDF and submits it to the designated printer. This process could take some time and CPU resources. As I don't want to drain all the resources on that server, I may use the producer-consumer pattern here: a queue holds the client jobs, and they are processed one by one.

My questions are:

I'm new to C#. What is the proper pattern to queue and process the jobs? Should I use ConcurrentQueue and ThreadPool to achieve it?
What is the proper way to notify the user whether the job succeeded or failed? Instead of using a callback service, is async an ideal way? My concern is that there may be lots of jobs in the queue, and I don't want the client to wait for them to complete.
The web service is placed behind a load balancer. How can I maintain a 'process queue' across the instances? I've tried Hangfire and it seems okay, but I'm looking for alternatives.
How can I find out the number of jobs in the queue and how many threads are currently running? The web service will be deployed on IIS. Is there a native way to achieve this, or should I implement a web service call to obtain them?

Any help will be appreciated, thanks!

What's the nature of your client? Since you're using wcf, I'm assuming it's not a browser? Full windows app? — Clay, Feb 09 '19 at 19:08
@Clay Sorry for forgetting to mention it. Yes, it's a full desktop app, mostly wpf. — nathan1658, Feb 09 '19 at 19:11

Clay · Answer 1 · 2019-02-09T22:38:41.030

WCF supports the idea of fire-and-forget methods. You just mark your contract interface method as one way, and there will be no waiting for a return:

[OperationContract( IsOneWay = true )]
void PrintPDF( PrintRequest request );

The only downside, of course, is that you won't get any notification from the server that your request was successful or even valid. You'd have to do some kind of periodic polling to see what's going on. I guess you could put a Guid into the PrintRequest, so you could interrogate for that job later.
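
To make the Guid idea a bit more concrete, here is a minimal sketch of a status table the server could keep, with the wcf plumbing left out. The names JobStatus, JobTracker, Register, Update and GetStatus are made up for illustration, and the table lives in memory, so as written it only helps when a single server instance handles both the request and the later poll:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical job states; nothing in wcf defines these.
public enum JobStatus { Unknown, Queued, Printing, Succeeded, Failed }

// In-memory status table keyed by the Guid carried in the request.
public sealed class JobTracker
{
    private readonly ConcurrentDictionary<Guid, JobStatus> _jobs =
        new ConcurrentDictionary<Guid, JobStatus>();

    // Called when a one-way print request arrives.
    // Returns false if a job with this id was already registered.
    public bool Register(Guid jobId) => _jobs.TryAdd(jobId, JobStatus.Queued);

    // Called by the worker as the job moves through its stages.
    public void Update(Guid jobId, JobStatus status) => _jobs[jobId] = status;

    // Called from a separate two-way operation that the client polls.
    public JobStatus GetStatus(Guid jobId) =>
        _jobs.TryGetValue(jobId, out var status) ? status : JobStatus.Unknown;
}
```

The client would create the id with Guid.NewGuid() before sending the request, then ask for that id's status every so often until it reads Succeeded or Failed.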

If you're not married to wcf, you might consider signalR...there's a comprehensive sample app of both a server and simple wpf client here. It has the advantage that either party can initiate an exchange once the connection has been established.

If you need to stick with wcf, there's the possibility of doing dualHttp. The client connects with an endpoint to callback to...and the server can then post notifications as work completes. You can get a feel for it from this sample.

Both signalR and wcf dualHttp are pretty straightforward. I guess my preference would be based on the experience of the folks doing the work. signalR has the advantage of playing nicely with browser-based clients...if that ever turns into a concern for you.

As for the queue itself...and keeping with the wcf model, you want to make sure your requests are serializable...so if need be, you can drain the queue and restart it later. In wcf, that typically means making data contracts for queue items. As an aside, I never like to send a boatload of arguments to a service; I prefer instead to make a data contract for method parameters and return types.

Data contracts are typically just simple types marked up with attributes to control serialization. The wcf methods do the magic of serializing/deserializing your types over the wire without you having to do much thinking. The client sends a whizzy and the server receives a whizzy as its parameter.

There are caveats...in particular, deserialization doesn't call your constructor (the serializer creates an uninitialized object, so neither constructors nor field initializers run) ...so you can't rely on the constructor to initialize properties. To that end, you have to remember that, for example, collection types that aren't required might need to be lazily initialized. For example:

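using System.Collections.Generic;
using System.Runtime.Serialization;

[DataContract]
public class ClientState
{
  // Static, so the lock object exists even on deserialized instances,
  // where no constructor or instance field initializer has run.
  private static readonly object sync = new object( );

  //--> and then somewhat later...

  // The backing field, not the property, is the serialized member.
  [DataMember( Name = "UpdateProblems", IsRequired = false, EmitDefaultValue = false )]
  List<UpdateProblem> updateProblems;
  /// <summary>Problems encountered during previous Windows Update sessions</summary>
  public List<UpdateProblem> UpdateProblems
  {
    get
    {
      lock ( sync )
      {
        // Still null after deserialization if the member was absent from the message.
        if ( updateProblems == null ) updateProblems = new List<UpdateProblem>( );
        return updateProblems;
      }
    }
  }

  //--> ...and so on...

}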

Something I always do is to mark the backing variable as the serializable member, so deserialization doesn't invoke the property logic. I've found this to be an important "trick".
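
If you want to see the constructor caveat for yourself, here is a small round-trip sketch using DataContractSerializer directly, without any wcf hosting. The Whizzy type and the WhizzyDemo helper are made up for illustration. It also shows an alternative to lazy initialization that I'm only offering as an option: an [OnDeserializing] callback, which the serializer invokes before it fills in the data members:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Whizzy
{
    // Runs for "new Whizzy()", but not when the serializer builds the object.
    public Whizzy()
    {
        Tags = new List<string>();
        ConstructorRan = true;
    }

    [DataMember]
    public string Name { get; set; }

    // Not data members, so they never travel over the wire.
    public List<string> Tags { get; private set; }
    public bool ConstructorRan { get; private set; }

    // Invoked by the serializer on the fresh, uninitialized instance.
    [OnDeserializing]
    private void OnDeserializing(StreamingContext context)
    {
        Tags = new List<string>();
    }
}

public static class WhizzyDemo
{
    // Serializes to an in-memory stream and reads the object back.
    public static Whizzy RoundTrip(Whizzy original)
    {
        var serializer = new DataContractSerializer(typeof(Whizzy));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, original);
            stream.Position = 0;
            return (Whizzy)serializer.ReadObject(stream);
        }
    }
}
```

After WhizzyDemo.RoundTrip(new Whizzy { Name = "a" }), the copy has its Name, ConstructorRan is false, and Tags is non-null only because of the callback.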

Producer/consumer is easy to write...and easy to get wrong. Look around on StackOverflow...you'll find plenty of examples. One of the best is here. You can do it with ConcurrentQueue and avoid the locks, or just go at it with a good ol' simple Queue as in the example.
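
For what it's worth, here is a minimal sketch of the lock-free flavor. It uses BlockingCollection, which wraps a ConcurrentQueue by default, rather than a raw ConcurrentQueue, so the consumer can block while the queue is empty instead of spinning. PrintJobQueue and the processJob delegate are made-up names, and processJob stands in for "generate the PDF and send it to the printer":

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Single-consumer job queue: callers Enqueue, one background task drains it in order.
public sealed class PrintJobQueue : IDisposable
{
    // FIFO, because the default backing store is a ConcurrentQueue<T>.
    private readonly BlockingCollection<Guid> _jobs = new BlockingCollection<Guid>();
    private readonly Task _worker;

    public PrintJobQueue(Action<Guid> processJob)
    {
        _worker = Task.Factory.StartNew(() =>
        {
            // Blocks while the queue is empty; the loop ends once
            // CompleteAdding() has been called and the queue is drained.
            foreach (var jobId in _jobs.GetConsumingEnumerable())
            {
                try { processJob(jobId); }
                catch (Exception) { /* record the failure here; keep the worker alive */ }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Jobs still waiting, not counting the one currently being processed.
    public int PendingCount => _jobs.Count;

    public void Enqueue(Guid jobId) => _jobs.Add(jobId);

    // Stop accepting work, let the queued jobs finish, then release resources.
    public void Dispose()
    {
        _jobs.CompleteAdding();
        _worker.Wait();
        _jobs.Dispose();
    }
}
```

One consumer means jobs run one at a time, which keeps CPU use bounded, and PendingCount is something a status operation could expose. The queue is not persisted, so anything still waiting is lost if the process recycles.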

But really...you're so much better off using some kind of service bus architecture and not rolling your own queue.

Being behind a load balancer means you probably want all your server instances calling a single service instance that manages one queue. You could roll your own, or you could let each instance manage its own queue. That might be more processing than you want going on on your server instances...that's your call. With wcf dual http, you may need your load balancer to be configured to have client affinity...so you can have session-oriented two-way communications. signalR supports a message bus backed by Sql Server, Redis, or Azure Service Bus, so you don't have to worry about affinity with a particular server instance. This has performance implications that are discussed here.

I guess the most salient advice is...find out what's out there and try to avoid reinventing the wheel. By all means, go for it if you're in burning/learning mode and can afford the time. But, if you're getting paid, find and learn the tools that are already in the field.

Since you're using .Net on both sides, you might consider writing all your contracts (service contracts and data contracts) into a .DLL that you use on both the client and the service. The nice thing about that is it's easy to keep things in sync, and you don't have to use the (rather weak) generated data contract types that come through WSDL discovery or the service reference wizard, and you can spin up client instances using ChannelFactory<IYourServiceContract>.

I appreciate your informative and detailed advice! I'll read the references. Another idea that came to mind: if I want the user to be able to query the status of their submitted job, instead of making the WCF call one way, should I return some kind of ticket that references the job ID? Also, is there any example project that does something similar? Thanks! — nathan1658, Feb 10 '19 at 08:36
Yeah, getting a job Id back might be more appropriate, especially if you're just queueing the request on the server. It's possible to allow the client to generate the job Id...if you use something like Guid.NewGuid(), for example...in that it'll be unique. If the server produces the Id, it would probably have to go all the way to the database to get a number, but that's unlikely to be too expensive. Not a big deal either way. — Clay, Feb 10 '19 at 13:38
