49

Here is my situation. I would like to make writing to the file system as efficient as possible in my application. The app is multi-threaded and each thread can possibly write to the same file. Is there a way that I can write to the file asynchronously from each thread without having the writes in the different threads bang heads together, so to speak?

I'm using C# and .NET 3.5, and I do have the Reactive Extensions installed as well.

Jeffrey Cameron

7 Answers

29

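For those who prefer code, I am using the following to do remote logging from web apps. Note that it is synchronous: each caller blocks until it holds the writer lock and its write has completed.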

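using System;
using System.IO;
using System.Reflection;
using System.Threading;

public static class LoggingExtensions
{
    static readonly ReaderWriterLock locker = new ReaderWriterLock();

    public static void WriteDebug(this string text)
    {
        // CodeBase is a file:// URI, so convert it to a local path before combining
        string directory = Path.GetDirectoryName(new Uri(Assembly.GetExecutingAssembly().CodeBase).LocalPath);
        string logPath = Path.Combine(directory, "debug.txt");

        // Acquire before the try block: if this times out, the finally block
        // must not release a lock that was never held
        locker.AcquireWriterLock(int.MaxValue); // You might want to change the timeout value
        try
        {
            // File.AppendAllLines requires .NET 4.0 or later
            File.AppendAllLines(logPath, new[] { text });
        }
        finally
        {
            locker.ReleaseWriterLock();
        }
    }
}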

Hope this saves you some time

Matas Vaitkevicius
  • Very nice! Missing one thing from the method signature if you're trying to make this an extension method of Strings. I would prepend "string text" with "this string text" for the first parameter. – Jason Foglia Jan 06 '17 at 14:52
  • 4
    The question specified *asynchronously*. This is all synchronous. – Servy Jan 06 '17 at 15:11
  • 4
    @Servy I used this in 3 separate threads and all the text made it to the file. Synchronous or not this solution worked for me without losing data. Also, forgive me if I'm wrong but isn't IO synchronous only? The solution for async is to collect strings while IO is locked and once unlocked write collection to file with the hope of sustaining order. – Jason Foglia Jan 06 '17 at 15:26
  • @JasonFoglia I didn't say that it wouldn't write to the file, I said it would do so synchronously, and the question *specifically* specified that they wanted to do so asynchronously. IO is in fact inherently *asynchronous*, not *synchronous*. Any synchronous IO operations you have involve an inherently asynchronous operation and some work done to explicitly sleep the thread until that asynchronous operation finishes. The solution for doing this asynchronously is to simply call asynchronous file IO (and synchronization) operations; that's pretty much it. – Servy Jan 06 '17 at 15:31
  • @Servy Yes, you are correct that IO operations are asynchronous, I meant on a single file? I'm not sure that you can read and write to the same file at the same time or write to the file at the same time from multiple processes or threads? – Jason Foglia Jan 06 '17 at 15:38
  • @JasonFoglia You most certainly can't (safely) read and write to a single file at the same time. The question says nothing about operating on the file *in parallel*, it says that it wants to write to the file *asynchronously*, which would mean asynchronously waiting until nobody is using the file, then asynchronously writing to it. Asynchronous doesn't mean parallel. – Servy Jan 06 '17 at 15:39
  • @Servy I stand corrected. I guess to make the code above asynchronous or at least the illusion of, is to wrap it in a Task. – Jason Foglia Jan 06 '17 at 15:53
  • @JasonFoglia So you want to schedule a thread pool thread to sit there and do nothing while an inherently asynchronous operation is actually performed instead of just performing the inherently asynchronous operation and just *not* blocking the current thread? – Servy Jan 06 '17 at 15:57
  • @Servy provide an answer then. No need to argue with me. I never said this answer was correct, I just said it was nice. It works for my situation with multiple threads, wrapping it in a Task would make it asynchronous. If you're looking for a performance solution look at @ µBio and @ Bob Bryan solutions. – Jason Foglia Jan 06 '17 at 16:21
  • @JasonFoglia So you posted an answer that you knew was wrong and doesn't answer the question just because...why? Why post an answer that you know doesn't answer the question? There are already other answers, so I see no reason to write my own. – Servy Jan 06 '17 at 16:24
  • @Servy I broke the dam not Jason... Since it solves the problem for which people come here..., I hate when I get given links to investigate, instead of code I could copy paste... It would be pointless to wrap it in Task though... – Matas Vaitkevicius Jan 06 '17 at 16:36
  • @MatasVaitkevicius You assume that people are coming to a question about how to write to a file asynchronously in order to get code that writes to a file synchronously, rather than asynchronously? What's your basis for that assumption? The answer doesn't answer the question. If you want to post code, rather than a link, that's fine, but you should still *be correctly answering the question* with that code. – Servy Jan 06 '17 at 16:46
  • Doesn't @Bob Bryan mention below avoiding asynchronous methods for IO? This solution solves the problem with multiple threads, which is a part of the question. It would seem to be the correct answer! – Jason Foglia Jan 06 '17 at 16:48
  • 2
    @Servy I am assuming... however most of people don't know shit... And when they are given solution that works, they are usually grateful... In my defence I can say that it was OP who started it... If I would give you code that would write to file asynchronously from multiple threads you would get gibberish.... and you know it... so whole argument about correctness of answer is somewhat pointless. Yes I am giving incorrect answer to incorrect question that solves problem for those who come here, everyone (but you) is happy, lol... Don't believe me post correct answer and see what happens... – Matas Vaitkevicius Feb 09 '17 at 03:21
  • @Servy Just a short example of multithread async writing that does not block and does not write gibberish... – Matas Vaitkevicius Feb 10 '17 at 15:36
  • @MatasVaitkevicius See the accepted answer, as I've said before. – Servy Feb 10 '17 at 15:42
  • Yeah i think a lot of the people who are coming here are perfectly content with this solution, though I see what @Servy is saying... not quite exactly what the question asked. but +1 for sure because of code i can copy-paste and use to get what I need. – Nicholas DiPiazza Jun 26 '18 at 00:54
15

Have a look at Asynchronous I/O. This will free up the CPU to continue with other tasks.
Combine it with a ReaderWriterLock, as @Jack B Nimble mentioned, so that the writes do not collide.

If by

writing to the file system as efficient as possible

you mean making the actual file I/O as fast as possible, you are going to have a hard time speeding it up much; the disk is simply physically slower. Maybe an SSD would help?
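One way to combine asynchronous I/O with mutual exclusion can be sketched as follows. This is a minimal sketch, not this answer's own code. It assumes a framework newer than the question's .NET 3.5 (async/await and SemaphoreSlim.WaitAsync need .NET 4.5 or later), and it swaps the ReaderWriterLock for a SemaphoreSlim, because ReaderWriterLock cannot be awaited. The class name AsyncFileAppender and its members are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: serializes asynchronous appends to a single file.
public sealed class AsyncFileAppender
{
    private readonly string path;

    // SemaphoreSlim(1, 1) serves as a mutex that can be awaited.
    private readonly SemaphoreSlim gate = new SemaphoreSlim(1, 1);

    public AsyncFileAppender(string path)
    {
        this.path = path;
    }

    public async Task AppendLineAsync(string line)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(line + Environment.NewLine);

        // Wait for our turn without blocking the calling thread.
        await gate.WaitAsync().ConfigureAwait(false);
        try
        {
            // useAsync: true asks the OS for asynchronous (overlapped) I/O.
            using (var stream = new FileStream(path, FileMode.Append, FileAccess.Write,
                                               FileShare.Read, 4096, useAsync: true))
            {
                await stream.WriteAsync(bytes, 0, bytes.Length).ConfigureAwait(false);
            }
        }
        finally
        {
            gate.Release();
        }
    }
}
```

Callers on any thread would await appender.AppendLineAsync("message"). The writes never overlap, but a caller waiting for its turn does not tie up a thread.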

brichins
µBio
  • 5
    This doesn't seem to address the question, the "bang heads together" issue. – Hans Passant Aug 18 '10 at 01:11
  • Yes, how does one prevent the asynchronous writes from colliding? – Jeffrey Cameron Aug 18 '10 at 14:23
  • 3
    The same way you handle any resource contention in a multi-threaded system: locks. ReaderWriterLock (or ReaderWriterLockSlim with 4.0 / Parallel Extensions + 3.5) allows multiple reads to occur concurrently, so if that is something you want, use it. – µBio Aug 18 '10 at 16:03
  • 1
    Current Asynchronous I/O can be found [here](https://msdn.microsoft.com/en-us/library/kztecsys(v=vs.110).aspx) – MaLiN2223 Apr 29 '16 at 14:40
  • How can you wait for a ReaderWriterLock asynchronously? – TigerBear Dec 19 '16 at 12:59
9

What I would do is have one or more separate worker threads dedicated to the task of writing files out. When one of your other threads needs to write some data out, it should call a function that adds the data to an ArrayList (or some other container/class). Inside this function, there should be a lock statement at the top to prevent more than one thread from executing it simultaneously. After adding the reference to the ArrayList, the function returns and the calling thread continues with its other work.

There are a couple of ways to handle the writing thread(s). Probably the simplest is to put it into an infinite loop with a sleep statement at the end so that it does not chew up your CPU(s). Another way is to use thread synchronization primitives and go into a wait state when there is no more data to be written out. This method implies that you would have to wake the thread with something like the ManualResetEvent.Set method.
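The second variant (a writer thread that waits instead of sleeping) can be sketched roughly as below. This is only an illustration under assumptions: the class name QueuedFileWriter is hypothetical, a generic Queue<string> stands in for the ArrayList, and Monitor.Wait/Monitor.Pulse stand in for the ManualResetEvent mentioned above. Everything used here is available in .NET 3.5.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// Hypothetical class: producers enqueue lines, one background thread writes them.
public sealed class QueuedFileWriter : IDisposable
{
    private readonly Queue<string> pending = new Queue<string>();
    private readonly object sync = new object();
    private readonly Thread worker;
    private readonly string path;
    private bool stopping;

    public QueuedFileWriter(string path)
    {
        this.path = path;
        worker = new Thread(WriteLoop) { IsBackground = true };
        worker.Start();
    }

    // Called from any thread; the lock is held only long enough to enqueue.
    public void Enqueue(string line)
    {
        lock (sync)
        {
            pending.Enqueue(line);
            Monitor.Pulse(sync); // wake the writer if it is waiting
        }
    }

    private void WriteLoop()
    {
        while (true)
        {
            string[] batch;
            lock (sync)
            {
                // Wait until there is data or a shutdown request.
                while (pending.Count == 0 && !stopping)
                    Monitor.Wait(sync);
                if (pending.Count == 0)
                    return; // stopping and fully drained
                batch = pending.ToArray();
                pending.Clear();
            }

            // Only this thread touches the file, and it does so outside the lock.
            using (var writer = new StreamWriter(path, true))
            {
                foreach (string line in batch)
                    writer.WriteLine(line);
            }
        }
    }

    // Writes whatever is still queued, then stops the writer thread.
    public void Dispose()
    {
        lock (sync)
        {
            stopping = true;
            Monitor.Pulse(sync);
        }
        worker.Join();
    }
}
```

Producers only pay for a short lock and an enqueue. The disk I/O happens in batches on the single writer thread, so writes cannot interleave.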

There are many different ways to read in and write out files in .NET. I have written a benchmark program and give the results in my blog:

http://designingefficientsoftware.wordpress.com/2011/03/03/efficient-file-io-from-csharp

I would recommend using the Windows ReadFile and WriteFile methods if you need performance. Avoid any of the asynchronous methods since my benchmark results show that you get better performance with synchronous I/O methods.

radbyx
Bob Bryan
6

While thread based locks can solve this, there is a manner which works across threads, but is probably best used when you have multiple processes writing to the end of a single file.

To get this behavior across processes (or threads too), tell the operating system that you want atomic append writes when the OS file handles are created. This is done by specifying O_APPEND under POSIX (Linux, Unix), and FILE_APPEND_DATA under Windows.

In C# you don't call the OS 'open', or 'CreateFile' system calls directly, but there are ways to get this result.

I asked how to do this under Windows a while ago, and got two good answers here: How can I do an atomic write/append in C#, or how do I get files opened with the FILE_APPEND_DATA flag?

Basically, you can use FileStream() or P/Invoke; I would suggest FileStream() over P/Invoke for obvious reasons.

You can use constructor arguments to FileStream() to specify asynchronous file I/O in addition to the FileSystemRights.AppendData flag, which should give you both async I/O and atomic append writes to a file.

Warning: Some OSes have limits on the maximum number of bytes that can be atomically written this way, and exceeding that threshold will remove the OS promise of atomicity.

Because of this last gotcha, I would recommend staying with lock() style contention management when trying to address your problem within a single process.

Community
Cameron
5

Use Reader / Writer locks to access the file stream.
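As a rough illustration of that suggestion, the sketch below guards a file with ReaderWriterLockSlim (available from .NET 3.5). The wrapper class GuardedFile and its method names are hypothetical. It is synchronous: callers block while waiting for the lock.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical wrapper: many concurrent readers, but only one writer at a time.
public sealed class GuardedFile
{
    private readonly ReaderWriterLockSlim rw = new ReaderWriterLockSlim();
    private readonly string path;

    public GuardedFile(string path)
    {
        this.path = path;
    }

    public void AppendLine(string line)
    {
        rw.EnterWriteLock(); // exclusive: waits for all readers and writers to leave
        try
        {
            File.AppendAllText(path, line + Environment.NewLine);
        }
        finally
        {
            rw.ExitWriteLock();
        }
    }

    public string ReadAll()
    {
        rw.EnterReadLock(); // shared: other readers may enter at the same time
        try
        {
            return File.Exists(path) ? File.ReadAllText(path) : string.Empty;
        }
        finally
        {
            rw.ExitReadLock();
        }
    }
}
```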

Jack B Nimble
1

Save to log with Queue and multiple threads (.Net Core 2.2 linux sample - tested)

using System;
using System.Collections;
using System.IO;
using System.Threading;

namespace LogToFile
{
    class Program
    {
        public static Logger logger = new Logger("debug.log");

        static void Main(string[] args)
        {
            Console.WriteLine("Hello World!");

            logger.add("[001][LOGGER STARTING]");

            Thread t0 = new Thread(() => DoWork("t0"));
            t0.Start();

            Thread t1 = new Thread(() => DoWork("t1"));
            t1.Start();

            Thread t2 = new Thread(() => DoWork("t2"));
            t2.Start();

            Thread ts = new Thread(() => SaveWork());
            ts.Start();
        }

        public static void DoWork(string nr){
            while(true){
                logger.add("Hello from worker .... number " + nr);
                Thread.Sleep(300);
            }
        }

        public static void SaveWork(){
            while(true){
                logger.saveNow();
                Thread.Sleep(50);
            }
        }
    }

    class Logger
    {
        // Queue comes from System.Collections. It is not thread-safe,
        // so every access is guarded by queueLock.
        private readonly Queue logs = new Queue();
        private readonly object queueLock = new object();
        public string path = "debug.log";

        public Logger(string path){
            this.path = path;
        }

        public void add(string t){
            lock(queueLock){
                this.logs.Enqueue("[" + currTime() +"] " + t);
            }
        }

        public void saveNow(){
            string entry = null;
            lock(queueLock){
                if(this.logs.Count > 0){
                    // Get from queue
                    entry = (string) this.logs.Dequeue();
                }
            }
            if(entry != null){
                // Save to logs outside the lock so producers are not blocked by disk I/O
                saveToFile(entry, this.path);
            }
        }

        public bool saveToFile(string text, string path)
        {
            try{
                // Disposing the StreamWriter flushes and closes the file
                using (StreamWriter sw = File.AppendText(path))
                {
                    sw.WriteLine(text);
                }
            }catch(Exception){
                // return to queue
                lock(queueLock){
                    this.logs.Enqueue(text + "[SAVE_ERR]");
                }
                return false;
            }
            return true;
        }

        public String currTime(){
            // HH is the 24-hour clock; hh would be ambiguous without an AM/PM marker
            return DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss");
        }
    }
}

Compile (Save to: LogToFile/Program.cs):

dotnet new console -o LogToFile
cd LogToFile
dotnet build
dotnet run

Stop the app with CTRL+C and view the log file:

cat debug.log
0

You can use events for the logger:

using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

namespace EventLogger
{
    class Program
    {
        static void Main(string[] args)
        {                   
            // Event handler
            LogData ld = new LogData();
            // Logger
            Logger lo = new Logger();                    
            // Subscribe to event
            ld.MyEvent += lo.OnMyEvent;     
            // Thread loop
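            int cnt = 1;
            while(cnt < 5){
                // Copy the counter: the lambda would otherwise capture the shared,
                // still-changing loop variable
                int id = cnt;
                Thread t = new Thread(() => RunMe(id, ld));
                t.Start();
                cnt++;
            }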
            Console.WriteLine("While end");
        }

        // Thread worker
        public static void RunMe(int cnt, LogData ld){
            int nr = 0;
            while(true){
                nr++;
                // Add user and fire event
                ld.AddToLog(new User(){Name = "Log to file Thread" + cnt + " Count " + nr, Email = "em@em.xx"});
                Thread.Sleep(1);
            }
        }
    }

    class LogData
    {
        public delegate void MyEventHandler(object o, User u);
        public event MyEventHandler MyEvent;

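        protected virtual void OnEvent(User u)
        {
            // Copy the delegate so a concurrent unsubscribe cannot null it
            // between the check and the call
            MyEventHandler handler = MyEvent;
            if(handler != null){
                handler(this, u);
            }
        }

        // Invoke
        public void AddToLog(User u){
            Console.WriteLine("Add to log.");

            // Fire the event
            OnEvent(u);

            Console.WriteLine("Added.");
        }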
    }

    class User
    {
        public string Name = "";
        public string Email =  "";
    }

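    class Logger
    {
        // Handlers run on the thread that raised the event,
        // so access to the file must be serialized
        private readonly object fileLock = new object();

        // Catch event
        public void OnMyEvent(object o, User u){
            try{
                Console.WriteLine("Added to file log! " + u.Name + " " + u.Email);
                lock(fileLock){
                    File.AppendAllText(@"event.log", "Added to file log! " + u.Name + " " + u.Email+"\r\n");
                }
            }catch(Exception e){
                Console.WriteLine("Error file log " + e);
            }
        }
    }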
}