1

I have an application A which calls another application B, which does some calculation and writes to a file, File.txt. A invokes multiple instances of B through multiple threads, and each instance tries to write to the same file, File.txt. Here comes the actual problem: since multiple threads try to access the same file, the file access throws an exception, which is expected.

I tried an approach of using a concurrent queue in a singleton class: each instance of B adds its output to the queue, and another thread in this class takes care of dequeuing items from the queue and writing them to the file File.txt. The queue is consumed by a single thread, so the write operations succeed. This works fine.
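A minimal sketch of that singleton-queue approach, assuming .NET's `BlockingCollection<T>` (the class name, member names, and file path here are illustrative, not from my actual code):

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public sealed class FileWriteQueue
{
    public static readonly FileWriteQueue Instance = new FileWriteQueue();

    private readonly BlockingCollection<string> _lines = new BlockingCollection<string>();

    private FileWriteQueue()
    {
        // Single consumer: only this thread ever touches File.txt,
        // so no file-sharing conflicts can occur.
        Task.Run(() =>
        {
            foreach (var line in _lines.GetConsumingEnumerable())
            {
                File.AppendAllText("File.txt", line + Environment.NewLine);
            }
        });
    }

    // Called by each instance of B; returns immediately.
    public void Enqueue(string line) => _lines.Add(line);

    // Call on shutdown so pending items are flushed before exit.
    public void Complete() => _lines.CompleteAdding();
}
```

The weakness described below is visible here: everything still sitting in `_lines` is lost if the process dies before the consumer drains it.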

With many threads and many items in the queue, the file writing still works, but if for some reason my queue crashes or stops abruptly, all the information that was supposed to be written to the file is lost.

If I make the file writing synchronous from B, without using the queue, it will be slow because each write has to wait for the file lock, but there is less chance of data being lost, since B writes to the file immediately.

What would be the best approach or design to handle this scenario? I don't need a response after the file writing is completed, and I can't make B wait for the file writing to complete.

Would async/await file writing be of any use here?

rahulmr
  • If you never care about whether or not the previous write has completed yet, you can do it asynchronously, as you suggested. For example, in a background task. For this you will need a concurrency lock of some form. – user5226582 Jan 09 '17 at 14:22
  • So is it ok that all multithreaded B instances call an async operation and continue their cycle without waiting for the async file writing to complete? If so, can you please provide some examples? – rahulmr Jan 09 '17 at 15:23
  • Sorry a bit late in responding, but this should give you a clue: https://stackoverflow.com/questions/6157752/usage-of-the-c-sharp-lock-keyword. Basically, you could create a shared lock, then lock it before every write to the file (and release afterwards). If a thread wants to write when the lock is occupied, you could delay it (current thread) until the lock becomes available again. – user5226582 Jan 19 '17 at 15:36
  • Or you potentially could check if the file is in use: https://stackoverflow.com/questions/876473/is-there-a-way-to-check-if-a-file-is-in-use instead of using explicit lock object - this would be useful if your application is not the only one accessing the file. – user5226582 Jan 19 '17 at 15:37
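The shared-lock idea from the comments above can be sketched roughly like this (the class name and file path are illustrative; every writer takes the same lock before appending, so writes are serialized):

```csharp
using System;
using System.IO;

public static class LockedFileWriter
{
    private static readonly object _gate = new object();

    public static void Append(string path, string line)
    {
        lock (_gate)   // blocks the calling thread until the lock is free
        {
            File.AppendAllText(path, line + Environment.NewLine);
        }
    }
}
```

Note this only protects against other threads in the same process; if another application also writes the file, the file-in-use check from the second comment would be needed instead.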

2 Answers

1

I think what you've done is the best that can be done. You may have to tune your producer/consumer queue solution if there are still problems, but it seems to me that you've done rather well with this approach.

If an in-memory queue isn't the answer, perhaps externalizing that to a message queue and a pool of listeners would be an improvement.

Relational databases and transaction managers were born to solve this problem. Why continue with a file-based solution? Is it possible to explore an alternative?

duffymo
  • Only if no other option works, such as a persistent message queue or async/await, would we have to go for a database. – rahulmr Jan 09 '17 at 15:22
1

Is there a better approach or design to handle this scenario?

You can make each producer thread write to its own rolling file instead of queuing the operation. Every X seconds the producers move to new files, and some aggregation thread wakes up, reads the previous files (one per producer), and writes the results to the final File.txt output file. No read/write locks are required here.

This ensures safe recovery since the rolling files exist until you process and delete them.

This also means that you always write to disk, which is much slower than queuing tasks in memory and writing to disk in bulk. But that's the price you pay for consistency.
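A rough sketch of the rolling-file scheme, with each producer's file named by its thread id (purely illustrative, and this simplified version omits the rotation handshake where producers switch to a fresh file before the aggregator reads the old one):

```csharp
using System;
using System.IO;
using System.Threading;

public static class RollingWriter
{
    // Each producer writes only to its own file: no locks needed.
    public static void Write(string line)
    {
        var path = $"producer-{Thread.CurrentThread.ManagedThreadId}.part";
        File.AppendAllText(path, line + Environment.NewLine);
    }

    // Aggregator: run every X seconds on its own timer/thread.
    public static void Aggregate()
    {
        foreach (var part in Directory.GetFiles(".", "*.part"))
        {
            File.AppendAllText("File.txt", File.ReadAllText(part));
            File.Delete(part);   // a rolled file survives until merged, hence safe recovery
        }
    }
}
```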

Would async/await file writing be of any use here?

Using asynchronous IO has nothing to do with this. The problems you mentioned were 1) a shared resource (the output file) and 2) lack of consistency (when the queue crashes), neither of which async programming is about.

Why async is in the picture is because I don't want to delay the existing work of B because of this file writing operation

async would indeed help you with that. Whatever pattern you choose to implement (to solve the original problem), it can always be made async simply by using the asynchronous IO APIs.
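For example, a sketch of an async write path that B can fire and forget (class name and path are illustrative; a `SemaphoreSlim`, the async-friendly counterpart of `lock`, serializes the appends):

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncFileWriter
{
    private static readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    public static async Task AppendAsync(string path, string line)
    {
        await _gate.WaitAsync();
        try
        {
            // Asynchronous file IO: no thread sits blocked on the disk.
            using (var writer = new StreamWriter(path, append: true))
            {
                await writer.WriteLineAsync(line);
            }
        }
        finally
        {
            _gate.Release();
        }
    }
}

// B can fire and forget without awaiting:
//   _ = AsyncFileWriter.AppendAsync("File.txt", data);
```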

shay__
  • We could have used the approach you mentioned of generating different files, but we already have fragmentation issues, and we expect files in the thousands range, so we dropped the plan of a separate file for each thread/invocation. Why async is in the picture is because I don't want to delay the existing work of B because of this file writing operation, so I thought of handing the file writing over to this async method and letting B continue its work as before. For the file writing I am handling the lock issues using a semaphore. Thanks for looking into the problem – rahulmr Jan 11 '17 at 05:24
  • Also, we have to read this file concurrently from another application, so if the file is split and only later written to a single file, we may not be able to achieve the requirement with ease. – rahulmr Jan 11 '17 at 05:26
  • @user3689864 please see the addition above. Also, how does your current implementation solve the fragmentation issue? I'm asking because it doesn't seem to :) – shay__ Jan 12 '17 at 07:22