Question

Does File.AppendAllText manage collisions from multiple writers?

Research

I noticed that the MSDN documentation doesn't really provide a position either way, so I decided I'd reflect the code and just see what it does. Below is the called method from File.AppendAllText:

private static void InternalAppendAllText(string path, string contents, Encoding encoding)
{
    using (StreamWriter streamWriter = new StreamWriter(path, true, encoding))
    {
        streamWriter.Write(contents);
    }
}

and as you can see it simply leverages a StreamWriter. So, if we dig a little deeper into that, specifically the constructor it uses, we find that it ultimately calls this constructor:

internal StreamWriter(string path, bool append, Encoding encoding, int bufferSize, bool checkHost) : base(null)
{
    if (path == null)
    {
        throw new ArgumentNullException("path");
    }
    if (encoding == null)
    {
        throw new ArgumentNullException("encoding");
    }
    if (path.Length == 0)
    {
        throw new ArgumentException(Environment.GetResourceString("Argument_EmptyPath"));
    }
    if (bufferSize <= 0)
    {
        throw new ArgumentOutOfRangeException("bufferSize", Environment.GetResourceString("ArgumentOutOfRange_NeedPosNum"));
    }
    Stream streamArg = StreamWriter.CreateFile(path, append, checkHost);
    this.Init(streamArg, encoding, bufferSize, false);
}

with the following values:

path:        the path to the file
append:      true
encoding:    UTF8NoBOM
bufferSize:  1024
checkHost:   true

and further we find that the base(null) implementation doesn't really do anything but set the InternalFormatProvider to null. So, if we keep digging we find that CreateFile:

private static Stream CreateFile(string path, bool append, bool checkHost)
{
    FileMode mode = append ? FileMode.Append : FileMode.Create;
    return new FileStream(path, mode, FileAccess.Write, FileShare.Read, 4096, FileOptions.SequentialScan, Path.GetFileName(path), false, false, checkHost);
}

creates a FileStream with these parameter values:

path:         the path to the file
mode:         FileMode.Append
access:       FileAccess.Write
share:        FileShare.Read
bufferSize:   4096
options:      FileOptions.SequentialScan
msgPath:      just the file name of the path provided
bFromProxy:   false
useLongPath:  false
checkHost:    true

and so now we're finally getting somewhere, because we're about to leverage the Windows API. This is where the question really begins: that FileStream constructor calls a method named Init. It's a pretty long method, but I'm really interested in one line:

this._handle = Win32Native.SafeCreateFile(text3,
    dwDesiredAccess,
    share,
    secAttrs,
    mode,
    num,
    IntPtr.Zero);

which of course calls CreateFile, where the parameter values are:

text3:            the full path to the file
dwDesiredAccess:  1073741824 (GENERIC_WRITE)
share:            1 (FILE_SHARE_READ)
secAttrs:         null
mode:             4 (OPEN_ALWAYS)
num:              134217728 | 1048576 (FILE_FLAG_SEQUENTIAL_SCAN | SECURITY_SQOS_PRESENT)
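
As a cross-check of the decimal values listed above, here is a minimal sketch that compares them with the constants as defined in the Windows SDK headers. The class name `CreateFileArgs` is hypothetical. Note that 1048576 (0x00100000) corresponds to SECURITY_SQOS_PRESENT in those headers; FILE_FLAG_POSIX_SEMANTICS is 0x01000000.

```csharp
using System;

public static class CreateFileArgs
{
    // Values as defined in the Windows SDK headers.
    public const uint GENERIC_WRITE = 0x40000000;
    public const uint FILE_SHARE_READ = 0x00000001;
    public const uint OPEN_ALWAYS = 4;
    public const uint FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000;
    public const uint SECURITY_SQOS_PRESENT = 0x00100000;

    public static void Main()
    {
        // Each comparison checks a decimal value from the list against its named constant.
        Console.WriteLine(GENERIC_WRITE == 1073741824);            // dwDesiredAccess
        Console.WriteLine(FILE_SHARE_READ == 1);                   // share
        Console.WriteLine(OPEN_ALWAYS == 4);                       // mode
        Console.WriteLine(FILE_FLAG_SEQUENTIAL_SCAN == 134217728); // first part of num
        Console.WriteLine(SECURITY_SQOS_PRESENT == 1048576);       // second part of num
    }
}
```

Every line prints True, so the call asks for write access, shares the file for reading only, and opens or creates it.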

so, what would Windows do if I had two threads trying to access that call at the same time for the same path? Would it open the file and buffer the writes so that both consumers are allowed to write to the file? Or do I need to leverage a lock object and lock around the call to AppendAllText?

Mike Perrenoud

3 Answers

Only one writer will win, and it will be the first. Any subsequent attempt to open the file for writing will fail with an IOException until the write lock is released (i.e. the buffer is flushed and the file closed). The file can, however, be opened for reading at the same time (permissions permitting).

Read - Allows subsequent opening of the file for reading. If this flag is not specified, any request to open the file for reading (by this process or another process) will fail until the file is closed. However, even if this flag is specified, additional permissions might still be needed to access the file.
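
A minimal sketch of that collision, assuming Windows file-sharing semantics. The names `SharingDemo` and `SecondWriterIsRefused` are hypothetical. It opens the file twice with the same mode, access, and share values that CreateFile uses in the question, and reports whether the second open was refused.

```csharp
using System;
using System.IO;

public static class SharingDemo
{
    // Returns true if a second writer is refused while the first handle is still open.
    public static bool SecondWriterIsRefused(string path)
    {
        // Same mode/access/share combination as the FileStream created by AppendAllText.
        using (var first = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read))
        {
            try
            {
                using (var second = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read))
                {
                    return false; // the second writer got in
                }
            }
            catch (IOException)
            {
                return true; // sharing violation: the first handle only shares for reading
            }
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        try
        {
            Console.WriteLine(SecondWriterIsRefused(path));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

On Windows this is expected to print True, consistent with the answer above. Other platforms emulate FileShare differently, so the result there may not match.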

Grant Thomas
  • Agreed. Also note the using statement. The file is opened and closed as quickly as possible so the window of vulnerability is small. But it is there. – Steve Wellens Apr 18 '13 at 13:01
  • @SteveWellens Yeah, that window of time would be significant in any dual-access situation however small it is, but it also has a chance of not being _that_ small a window. – Grant Thomas Apr 18 '13 at 13:11

The key is this method:

private static Stream CreateFile(string path, bool append, bool checkHost)
{
    FileMode mode = append ? FileMode.Append : FileMode.Create;
    return new FileStream(path, mode, FileAccess.Write, FileShare.Read, 4096, FileOptions.SequentialScan, Path.GetFileName(path), false, false, checkHost);
}

It's opening with FileShare.Read, meaning that other threads or processes can open the file for reading, but no other process/thread can open it for writing.

You probably wouldn't want it to allow multiple concurrent writers. Consider writing two very large buffers. It's very likely that they would end up being interleaved.

So, yes ... if you have multiple threads that might be appending to that file, you need to synchronize access, probably with a lock.
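
A minimal sketch of the lock approach, assuming all writers live in the same process. The names `LockedAppender`, `FileLock`, and `Append` are hypothetical.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class LockedAppender
{
    // One lock object shared by every thread in this process.
    private static readonly object FileLock = new object();

    public static void Append(string path, string line)
    {
        // Only one thread at a time may open, write, and close the file.
        lock (FileLock)
        {
            File.AppendAllText(path, line + Environment.NewLine);
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "locked-append-demo.txt");
        File.Delete(path);

        // 100 concurrent appends; without the lock some could fail with an IOException.
        Parallel.For(0, 100, i => Append(path, "entry " + i));

        Console.WriteLine(File.ReadAllLines(path).Length);
    }
}
```

This should print 100. A lock statement only coordinates threads inside one process; writers in separate processes would need a cross-process primitive instead.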

Another option, depending on your application, would be to have a consumer thread that reads text from a queue and appends to the file. That way, only one thread has access to the file. Other threads put the messages on a queue that the writer thread services. This is pretty easy to do with BlockingCollection, but is probably overkill unless you're writing to the file on a continual basis (as in logging).
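
A minimal sketch of that producer/consumer setup, not a production logger. The names `QueuedFileWriter`, `Enqueue`, and `QueueDemo` are hypothetical. Any thread may enqueue a line, and a single consumer task is the only code that touches the file.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public sealed class QueuedFileWriter : IDisposable
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();
    private readonly Task _consumer;

    public QueuedFileWriter(string path)
    {
        // The single consumer owns the file for the lifetime of the writer.
        _consumer = Task.Factory.StartNew(() =>
        {
            using (var writer = new StreamWriter(path, true))
            {
                // Blocks while the queue is empty; ends once CompleteAdding is called
                // and the remaining items have been drained.
                foreach (string line in _queue.GetConsumingEnumerable())
                {
                    writer.WriteLine(line);
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Safe to call from any thread.
    public void Enqueue(string line)
    {
        _queue.Add(line);
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // no more items will be added
        _consumer.Wait();        // let the consumer drain the queue and close the file
        _queue.Dispose();
    }
}

public static class QueueDemo
{
    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "queued-writer-demo.txt");
        File.Delete(path);

        using (var log = new QueuedFileWriter(path))
        {
            Parallel.For(0, 100, i => log.Enqueue("entry " + i));
        }

        Console.WriteLine(File.ReadAllLines(path).Length);
    }
}
```

This should print 100. The producers never block on file I/O, which is why this shape suits continual logging; the cost is the extra thread and the queue's memory.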

Jim Mischel
  • Now that is an idea! Having **one writer that reads a queue** drastically simplifies the problem. I have more than one scenario in PROD like this, but one is logging, do you have an example handy using the `BlockingCollection` for logging? If not, no worries, I can do it. – Mike Perrenoud Apr 18 '13 at 13:12
  • Don't worry about the example my friend, it's cake. One quick question though. You mentioned that it was probably overkill unless it was for something like logging. Is that because of the overhead of the type itself (i.e. in lower transaction scenarios it would cause performance issues)? – Mike Perrenoud Apr 18 '13 at 13:18
  • @MichaelPerrenoud - If you're using it for logging, depending on how much, you may prefer using a logging framework such as `Log4Net` which should handle multithreading scenarios for you http://stackoverflow.com/questions/1519211/multithread-safe-logging – keyboardP Apr 18 '13 at 13:20
  • @Michael: There's some memory and performance overhead to managing a queue, and having a separate thread does add complexity to a program at some level, although potentially simplifies it in others (i.e. not needing a lock). If I had multiple threads that only sometimes needed to append the file, I'd be inclined to use the lock technique. But with something like logging, where the threads are always writing to the file, I'd definitely use a producer/consumer setup. – Jim Mischel Apr 18 '13 at 13:23
  • @keyboardP, I love `log4net`, but the scenario I'm thinking of will not allow me to leverage that library. It's a political thing. – Mike Perrenoud Apr 18 '13 at 13:24
  • @MichaelPerrenoud - Ah fair enough :) – keyboardP Apr 18 '13 at 13:24
  • @MichaelPerrenoud: I don't have an example at hand. I know I've posted one here before, but it's not showing up in a quick search. Suppose I should write something and post on my blog. – Jim Mischel Apr 18 '13 at 13:29
  • @JimMischel, if you post it on your blog you can bet I'll read it. I check it every day. Thanks a lot my friend, I really appreciate the answer - this is the direction I'm going to go! – Mike Perrenoud Apr 18 '13 at 13:32

I know this topic is old, but I found a solution after reading https://stackoverflow.com/a/18692934/3789481

By using a named EventWaitHandle, I can easily prevent collisions with File.AppendAllText:

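    // Named handle, initially signaled, so the first waiter gets through immediately
    EventWaitHandle waitHandle = new EventWaitHandle(true, EventResetMode.AutoReset, "SHARED_BY_ALL_PROCESSES");

    Task[] tasks = new Task[ThreadCount];
    for (int counter = 0; counter < ThreadCount; counter++)
    {
        // Copy the loop variable so each lambda sees its own value
        int threadNum = counter;
        var dividedList = ....
        // Start the task without awaiting it here, so the tasks actually run concurrently
        tasks[counter] = Task.Run(() => RunTask(threadNum, dividedList, waitHandle));
    }
    await Task.WhenAll(tasks);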

And RunTask writes to the file:

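    private static async Task RunTask(int threadNum, List<string> ids, EventWaitHandle waitHandle)
    {
        Console.WriteLine($"Start thread {threadNum}");

        foreach (var id in ids)
        {
            // Block until no other thread or process holds the handle
            waitHandle.WaitOne();
            try
            {
                // text is the line to write for this id (how it is built is not shown)
                File.AppendAllText(@".\Result.txt", text + Environment.NewLine);
            }
            finally
            {
                // Always signal, even if the write throws, so other waiters are not blocked forever
                waitHandle.Set();
            }

            // ninja code
        }
        Console.WriteLine($"End thread {threadNum}");
    }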

I have tested this with 500 threads, and it worked well.

Tấn Nguyên