Articles on how to organize background queue operations

Question

I'm thinking about how to organize the architecture of a system. The system will consist of a web site, where users can upload documents and get them back processed, and a background daemon with a queue of tasks that processes the uploaded documents.

My question is: should I implement the daemon described above as a WCF service using only named pipes (no network access to this service is needed)?

Any suggestions or tips on that?

The data users provide is just a bunch of XML files. The ASP.NET web site will expose functionality to receive these XML files and should then somehow be able to pass them to the daemon.

Could you please point me to some articles on this topic? Thanks in advance!

POST EDIT

After some hours exploring MSMQ, which was suggested here, my impression is that the technology is aimed more at distributed architectures (processing nodes located on separate machines, exchanging messages between different computers over the network).

At the moment, separating the system onto independent machines is not needed. There will be just one machine hosting both the ASP.NET website and the processing program.

Is using MSMQ really necessary?

POST EDIT #2

As I'm using the .NET Framework here, please suggest only options that are compatible with .NET. There are really no other options here.

It would be prudent to set the WCF service up such that you can run the processing and the site on separate boxes. That way, onerous background processing won't affect web server performance. – Jon Egerton, Oct 11 '12 at 09:16
Connecting the website to the daemon through MSMQ seems simpler to me. – lboshuizen, Oct 11 '12 at 09:24
@JonEgerton At the current stage of the project, using separate servers is not suitable for us. But thanks for the advice! – kseen, Oct 11 '12 at 09:27
@lboshuizen Could you please share some knowledge on that? It seems to be exactly what I need. – kseen, Oct 11 '12 at 09:27
Even if you don't need the flexibility of distribution an MQ system provides, you should use it just for the convenience it provides. Any self-made async/scheduler code will be more bug-ridden and more complex than initially expected, and less future-proof. – Jürgen Strobel, Oct 22 '12 at 15:31

nick_w · Answer 1 · 2012-10-16T04:55:43.397

If your deployment will be on a single server, your initial idea of a WCF service is probably the way to go - see MSDN for a discussion regarding hosting in IIS or in a Windows Service.

As @JeffWatkins said, a good pattern to follow when calling the service is to simply pass it the location of the file on disk that needs processing. This will be much more efficient when dealing with large files.

I think the precise approach taken here will depend on the nature of files you are receiving from users. In the case of quite small files you may find it more efficient to stream them to your service from your website such that they never touch the disk. In this case, your service would then expose an additional method that is used when dealing with small files.

Edit

Introducing a condition where the file may be streamed is probably a good idea, but it would be valuable for you to do some testing so you can figure out:

Whether it is worth doing
What the optimal size is for streaming versus writing to disk
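A minimal sketch of such a size condition, written in Python purely as a language-neutral illustration (it is not .NET or WCF code). The names `STREAM_THRESHOLD_BYTES`, `choose_transport`, and `submit`, and the 4 MB value, are hypothetical placeholders; the real cut-off should come from the testing described above.

```python
import os
import tempfile

# Hypothetical cut-off; the real value should come from measurement.
STREAM_THRESHOLD_BYTES = 4 * 1024 * 1024


def choose_transport(size_bytes, threshold=STREAM_THRESHOLD_BYTES):
    """Return "stream" for small uploads and "file" for large ones."""
    return "stream" if size_bytes <= threshold else "file"


def submit(data, spool_dir, threshold=STREAM_THRESHOLD_BYTES):
    """Build the request the web site would hand to the processing service.

    Small payloads travel inline; large ones are written to disk and only
    their path is passed on.
    """
    if choose_transport(len(data), threshold) == "stream":
        return {"mode": "stream", "payload": data}
    fd, path = tempfile.mkstemp(suffix=".xml", dir=spool_dir)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return {"mode": "file", "path": path}
```

Keeping the threshold as a parameter makes it easy to try several values while measuring. In a real service, a 1 GB upload would be written to disk in chunks rather than held in memory first.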

My answer was based on the assumption that you were deploying to a single machine. If you want something more scalable, then yes, using MSMQ would be a good way to scale your application.

See MSDN for some sample code for building a WCF/MSMQ demo app.

Thanks for your response. Actually, the size of users' XML data files varies from 100 kilobytes to 1 gigabyte. Maybe we should introduce some condition on size to decide whether to pass the data as a stream or as a file? What do you think about MSMQ here? Another question is how scalable this system is: can I add extra processing servers at some point in the future? – kseen, Oct 16 '12 at 03:44

score 3 · Answer 2 · answered Oct 17 '12 at 21:46

I've designed something similar. We used a WCF service as the connection point, then RabbitMQ for queuing up the messages. A separate service then works through the items in the queue and sends an async callback when the task is finished, thereby completing the WCF call (WCF has many built-in features for dealing with this).

You can set up timeouts on each side, or you can even choose to drop the WCF connection and use the async callback to notify the user that processing is finished. FYI, I had much better luck with RabbitMQ than MSMQ.

I don't have any links for you, as this is something our team came up with and has worked very well (1000 TPS with a 4 server pool, 100% stateless) - Just an Idea.

bugnuker

Thank you for your response! Could you please tell me about the second service, the separate one that processes the work items? Which technology was used to create it? – kseen Oct 18 '12 at 04:14
It listened to the RabbitMQ service and picked up the messages (our messages were serialized objects of our class). When the "2nd" service picked up a message, it would parse the message and just do its thing. We would have loved to use the 4.5 Framework for easy async operations, but it was not out yet. We just set up a couple of worker threads with callbacks and spawned new ones when load was high. – bugnuker Oct 18 '12 at 14:21
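
The worker arrangement described in that comment can be sketched roughly as follows. This is a language-neutral illustration in Python, not the poster's code: an in-process `queue.Queue` stands in for the RabbitMQ queue, `start_workers`, `handler`, and `on_done` are hypothetical names, and the pool size is fixed rather than grown under load.

```python
import queue
import threading


def start_workers(tasks, handler, on_done, count=2):
    """Start `count` threads that take items from `tasks` until they see None."""

    def run():
        while True:
            item = tasks.get()
            try:
                if item is None:  # sentinel: this worker shuts down
                    return
                try:
                    result, error = handler(item), None
                except Exception as exc:  # report failures via the callback
                    result, error = None, exc
                on_done(item, result, error)
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=run, daemon=True) for _ in range(count)]
    for thread in threads:
        thread.start()
    return threads


# Usage: queue two document paths, then one sentinel per worker.
tasks = queue.Queue()
finished = []
workers = start_workers(
    tasks,
    handler=lambda path: "processed " + path,  # stand-in for real processing
    on_done=lambda item, result, error: finished.append((item, result, error)),
)
for path in ("a.xml", "b.xml"):
    tasks.put(path)
for _ in workers:
    tasks.put(None)
tasks.join()  # returns once every queued item has been handled
```

The callback receives either a result or the exception, so a failing document does not take its worker thread down with it.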

score 1 · Answer 3 · answered Oct 18 '12 at 23:30

I would give a serious look to ServiceStack. This functionality is built-in, and you will have minimal programming to do. In addition, ServiceStack's architecture is very good and easy to debug if you do run into any issues.

https://github.com/ServiceStack/ServiceStack/wiki/Messaging-and-redis

On a related note, my company does a lot of asynchronous background processing with a web-based REST api front end (the REST service uses ServiceStack). We do use multiple machines and have implemented a RabbitMQ backend; however, the RabbitMQ .NET library is very poorly-designed and unnecessarily cumbersome. I did a redesign of the core classes to fix this issue, but have not been able to publish them to the community yet as we have not released our project to production.

theMayer

Thank you for your response! Could you please tell me in a few words what the pros and cons are of using ServiceStack instead of MSMQ? – kseen Oct 19 '12 at 02:39
Well, I have not used MSMQ, but you are locked in to Microsoft if you go that route. Plus their stuff always has way too much overhead, which ServiceStack does not. – theMayer Oct 19 '12 at 03:29
I'm tied to Microsoft technologies from the start because I'm using the .NET platform. As I read on Wikipedia, there is no official port of Redis for Windows. So unfortunately I can't use ServiceStack in my project? – kseen Oct 19 '12 at 03:40
As far as I know, you can use a SQL back end for the message queues. But again, I use RabbitMQ and have rolled my own. I do know that RabbitMQ is super-fast, with latency in the milliseconds. – theMayer Oct 19 '12 at 04:16

score 0 · Answer 4 · answered Oct 11 '12 at 09:35

Have a look at http://www.devx.com/dotnet/Article/27560

It's a little dated, but it can give you a head start and a basic understanding.

lboshuizen

Does MSMQ allow sending huge messages (the XML data files can be up to 1 GB)? – kseen Oct 11 '12 at 10:13
Or maybe I should just send a message with the paths to the files instead of embedding them in the message? – kseen Oct 11 '12 at 10:15
I would go for "just" the path; transferring 1 GB while you have a filesystem seems a bit redundant to me. – lboshuizen Oct 11 '12 at 10:19
Please consider the "Claim cheque" pattern if you have queues and a large payload, i.e. leave your payload somewhere accessible like a database or a file system and then pass a claim cheque around rather than the payload or a path. – Jeff Watkins Oct 15 '12 at 15:32
@JeffWatkins It seems like you're pointing me in the right direction. Could you please provide some links to that material? – kseen Oct 15 '12 at 16:45
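
As a rough illustration of the claim cheque idea from the comments above: the payload is stored once, and only a small token travels through the queue. This is a language-neutral sketch in Python, not .NET code; `ClaimCheckStore`, `check_in`, and `redeem` are hypothetical names, and a directory on the local file system is assumed as the shared store.

```python
import os
import tempfile
import uuid


class ClaimCheckStore:
    """Keeps payloads as files in one directory and hands out opaque tokens."""

    def __init__(self, directory):
        self.directory = directory

    def check_in(self, payload):
        """Store `payload` (bytes) and return the claim token for it."""
        token = uuid.uuid4().hex
        with open(self._path(token), "wb") as handle:
            handle.write(payload)
        return token

    def redeem(self, token):
        """Return the payload for `token` and delete the stored copy."""
        path = self._path(token)
        with open(path, "rb") as handle:
            payload = handle.read()
        os.remove(path)
        return payload

    def _path(self, token):
        return os.path.join(self.directory, token + ".xml")


# Usage: the web site checks the upload in and queues only the token.
store = ClaimCheckStore(tempfile.mkdtemp())
token = store.check_in(b"<doc>example</doc>")
message = {"claim": token}  # this small message is what goes on the queue
```

The daemon would later call `store.redeem(message["claim"])` to fetch the document. For gigabyte-sized files, a real implementation would stream to and from the store instead of holding the whole payload in memory.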
