
I have a Windows Service that is event driven, and each event is handled by its own thread. Events usually arrive at a rate of around 10 per second. The last step in handling an event is to save data to a database.

If saving the data fails because a connection to the database cannot be established, I want the first thread that encounters this problem to start a new reconnecting Task that runs at some interval (e.g. every 30 seconds). Any threads that hit the same failure afterwards should simply end. There should be only one reconnecting Task running at any time.

How do I code this so that the following event threads can safely tell that a reconnecting Task is already running, and then simply end? Maybe there is a good design pattern for this?

EDIT: Following FelixD's suggested links: Does cancelling a cancellation token trigger an exception in the Task?

If so, I could probably catch the exception, so it would save the incoming data in a file rather than commit it to the database. I have tried to search for this, but it is not clear to me what happens when a Task is cancelled.

Mr. Blonde
  • Have a look at [CancellationToken](https://msdn.microsoft.com/en-us/library/system.threading.cancellationtoken(v=vs.110).aspx) – Felix D. Sep 25 '17 at 10:06
  • @FelixD. If I understand the idea correctly, the first thread starting the reconnection job would cancel the other running threads? As I understand the Cancellation Token, the threads will stop "mid-job". In my scenario I would like to save the data from the events before they are cancelled (and commit it when the connection is up). Can I accomplish this with Cancellation tokens? – Mr. Blonde Sep 25 '17 at 10:26
  • I guess so .. does [this](https://stackoverflow.com/q/12949024/4610605) also help you ? – Felix D. Sep 25 '17 at 10:57
  • Why specifically have the first thread try to reconnect and dispose of all the others? Is it equally valid to dispose of all threads and then start a new one which checks the connection? – Flater Sep 25 '17 at 11:20
  • @Flater It can be any of the started threads. By the first thread I mean the first thread that cannot commit to the database. This thread would start a reconnecting Task. Following threads encountering the error should not start another reconnect task. – Mr. Blonde Sep 25 '17 at 11:23
  • @Mr.Blonde: You're missing the point of my question. Why does it need to be an **existing** thread? Why not simply start a new thread, one that is specifically set up to check the connection? – Flater Sep 25 '17 at 11:24
  • @Flater This is indeed how I want to do it. Sorry if my wording was unclear. – Mr. Blonde Sep 25 '17 at 11:25
  • @Mr.Blonde you *shouldn't* be keeping a single connection for all threads. Open the connection as needed inside a `using` block. ADO.NET uses connection pooling so you *don't* have to open a new connection each time. By trying to use a global connection you end up *hurting* performance and scalability, as locks can accumulate and lead to excessive blocking and waits – Panagiotis Kanavos Sep 25 '17 at 11:37
  • @Mr.Blonde BTW 10/sec is low traffic. If you had more traffic, you could batch records together and send them to the database with `SqlBulkCopy`. 10/sec probably isn't enough to justify this – Panagiotis Kanavos Sep 25 '17 at 11:39
  • @Mr.Blonde another idea is to use TPL Dataflow and eg BatchBlock to batch events together and send them to an ActionBlock that will insert the entire batch in the database. – Panagiotis Kanavos Sep 25 '17 at 11:40
  • This sounds like premature optimization - I agree with the above that you should be opening a new 'connection' instance for each event and let the database optimize the throughput for you. Make it faster when you find it too slow. Caching is _hard_. – Gusdor Sep 25 '17 at 11:42
  • @PanagiotisKanavos I edited my question. I do create one connection for each data commit. My problem is that I go into an "offline" state when data cannot be committed, and I want only one Task checking whether the connection can be reestablished. I tried to accomplish this already, but the service log suggests that more than one reconnecting job is started. – Mr. Blonde Sep 25 '17 at 11:42

2 Answers


I suggest creating a BlockingCollection, and a single worker thread that continually monitors it. When one of your receiver threads gets a message, it processes the message and then, rather than trying to write it to the database, writes it to the BlockingCollection.

The worker thread that monitors the BlockingCollection can have a single persistent connection to the database, and can write individual records, or batch them to do a bulk update, or write the records to a file if the database connection fails for some reason.

This kind of thing is easy enough to set up working directly with BlockingCollection, or you can use TPL Dataflow.
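As a rough illustration of the queue-plus-single-writer idea, here is a minimal sketch. The names `EventSink`, `StartWriter`, `trySave` and `fallback` are made up for this example, and the two delegates are stand-ins for your real database write and your file fallback. It assumes records are plain strings and that the save delegate returns false when the database is unreachable.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class EventSink
{
    // Starts the single writer. It drains the queue in FIFO order until
    // CompleteAdding() is called and the queue is empty.
    // trySave and fallback are hypothetical stand-ins (assumptions), not real APIs.
    public static Task StartWriter(
        BlockingCollection<string> queue,
        Func<string, bool> trySave,
        Action<string> fallback)
    {
        return Task.Factory.StartNew(() =>
        {
            foreach (var record in queue.GetConsumingEnumerable())
            {
                // Only this one task ever talks to the database, so there is
                // never more than one place deciding what to do when it is down.
                if (!trySave(record))
                {
                    fallback(record);
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    public static void Main()
    {
        var saved = new List<string>();
        var parked = new List<string>();

        // Bounded so producers block instead of exhausting memory during a long outage.
        var queue = new BlockingCollection<string>(boundedCapacity: 1000);

        var writer = StartWriter(
            queue,
            record => { saved.Add(record); return true; }, // pretend the database is up
            record => parked.Add(record));

        // Event threads only enqueue; they never touch the database.
        Parallel.For(0, 10, i => queue.Add("event " + i));

        queue.CompleteAdding(); // signal shutdown
        writer.Wait();          // wait for the writer to drain the queue

        Console.WriteLine(saved.Count + " saved, " + parked.Count + " parked");
    }
}
```

Because only the writer touches the database, the question of which thread starts the reconnect logic goes away: the writer can pause, retry, or park records on its own while the event threads keep enqueueing.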

Jim Mischel
  • Would a ConcurrentQueue be equivalent to a BlockingCollection if I need FIFO? – Mr. Blonde Sep 25 '17 at 13:28
  • @Mr.Blonde: The default backing store for `BlockingCollection` is a `ConcurrentQueue`. `BlockingCollection` is significantly easier to work with. See https://stackoverflow.com/questions/19847905/threading-and-asynchronous-operations-in-c-sharp/19848796#19848796 for a simple example. – Jim Mischel Sep 25 '17 at 13:33
  • Thanks. I believe I can accomplish what I want with your design suggestion. – Mr. Blonde Sep 25 '17 at 13:35

Loss of connection to a database is usually described as a transient fault. Often it occurs because the database cannot service the traffic that is being thrown at it, so it starts rejecting or timing out connections.

Have you actually observed this fault occurring or are you pre-empting it?

You should handle a transient fault by waiting for a period and then retrying the operation a number of times, including opening a new connection. Do this for all threads. If all retries fail, be prepared to fall back to a known state and return an error.
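To make the wait-and-retry idea concrete, here is a hand-rolled sketch. The `Retry.TryExecute` helper is a hypothetical name invented for this example and is not part of any library. It catches every `Exception` only to keep the example short; in real code you would probably catch just the exception types your data provider throws for connection failures.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Runs the operation up to maxAttempts times, waiting between failed attempts.
    // Returns true on the first success, false if every attempt failed.
    public static bool TryExecute(Action operation, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                operation(); // should open a fresh connection on every attempt
                return true;
            }
            catch (Exception)
            {
                // Assumption: any exception is treated as transient in this sketch.
                if (attempt == maxAttempts)
                {
                    return false; // out of retries, the caller falls back
                }
                Thread.Sleep(delay);
            }
        }
        return false; // only reached when maxAttempts is less than 1
    }

    public static void Main()
    {
        int calls = 0;

        // Simulated save that fails twice before succeeding.
        bool ok = TryExecute(() =>
        {
            calls++;
            if (calls < 3)
            {
                throw new InvalidOperationException("simulated connection failure");
            }
        }, maxAttempts: 5, delay: TimeSpan.FromMilliseconds(100));

        Console.WriteLine(ok
            ? "saved after " + calls + " attempts"
            : "gave up, falling back to a known state");
    }
}
```

When `TryExecute` returns false, that is the point to fall back, for example by writing the record to a file and reporting the error.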

I find Polly (by the .NET Foundation) to be very useful for handling this behaviour in a structured way. http://www.thepollyproject.org/

Gusdor