
I have to write thousands of dynamically generated lines to a text file. I have two choices; which one consumes fewer resources and is faster?

A. Using StringBuilder and File.WriteAllText

StringBuilder sb = new StringBuilder();

foreach(Data dataItem in Datas)
{
    sb.AppendLine(
        String.Format(
            "{0}, {1}-{2}",
            dataItem.Property1,
            dataItem.Property2,
            dataItem.Property3));
}

File.WriteAllText("C:\\example.txt", sb.ToString(), new UTF8Encoding(false)); 

B. Using File.AppendText

using(StreamWriter sw = File.AppendText("C:\\example.txt"))
{
    foreach (Data dataItem in Datas)
    {
        sw.WriteLine(
            String.Format(
                "{0}, {1}-{2}",
                dataItem.Property1,
                dataItem.Property2,
                dataItem.Property3));
    }
}
  • http://ericlippert.com/2012/12/17/performance-rant/ – Soner Gönül Apr 24 '13 at 12:07
  • I need to do this operation as fast as I can, because it involves networking and database writes, and I have some related bottlenecks. I'm asking this question because I need to, not because I'm unaware of the article you linked. – Alberto León Apr 24 '13 at 12:13
  • 2
  • I'm not sure speed should be your main concern here. Memory usage is clearly what should drive your decision. If the data is always small then you can just run some tests to determine which is faster. – juharr Apr 24 '13 at 12:15

2 Answers


Your first version, which puts everything into a StringBuilder and then writes it, will consume the most memory. If the text is very large, you risk running out of memory. It might be faster, but it could just as easily be slower.

The second option will use much less memory (basically, just the StreamWriter buffer) and will perform very well, possibly better than the first method, without the same risk of running out of memory. I would recommend this option.

You can speed it up quite a lot by increasing the size of the output buffer. Rather than

File.AppendText("filename")

Create the stream with:

const int BufferSize = 65536;  // 64 Kilobytes
StreamWriter sw = new StreamWriter("filename", true, Encoding.UTF8, BufferSize);

A buffer size of 64K gives much better performance than the default 4K buffer size. You can go larger, but I've found that larger than 64K gives minimal performance gains, and on some systems can actually decrease performance.
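
Putting that together with your option B loop, a complete version might look like the sketch below (the path and the BOM-less UTF8Encoding(false) are just carried over from the question; the 64 KB figure is the one suggested above):

const int BufferSize = 65536;  // 64 KB output buffer

// true = append; new UTF8Encoding(false) matches the BOM-less UTF-8 used in the question
using (StreamWriter sw = new StreamWriter("C:\\example.txt", true, new UTF8Encoding(false), BufferSize))
{
    foreach (Data dataItem in Datas)
    {
        sw.WriteLine("{0}, {1}-{2}", dataItem.Property1, dataItem.Property2, dataItem.Property3);
    }
}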

Jim Mischel
  • This buffer size is difficult to anticipate, because I never know how many lines there will be or how much data they take; I think it's between 2K and 4K. I'd run the risk of an OutOfMemoryException if I arbitrarily chose a buffer size smaller than what I need. – Alberto León Apr 24 '13 at 14:55
  • 1
  • @AlbertoLeón: No, picking a too-small buffer size won't give you an out of memory exception. That's just a temporary buffer that the `StreamWriter` uses to prevent it from having to call the Windows `Write` function for every character. It buffers data and then writes it in blocks. You won't get an error if you try to write a block that's larger than the buffer. – Jim Mischel Apr 24 '13 at 15:10
  • Great! Then I think your solution improves on my second approach; I like it. – Alberto León Apr 24 '13 at 15:45
  • "_I've found that larger than 64K gives minimal performance gains, and on some systems can actually decrease performance_". I am interested to know: how? By explicit benchmarking, or implicitly from real-world data in your applications? – user1451111 Sep 08 '18 at 12:22
  • 1
  • @user1451111 I did not do rigorous benchmarking. I did test several of my applications at various buffer sizes. A lot has changed in the .NET Framework since I gathered the data on which the answer is based. Things might very well be different now. – Jim Mischel Sep 09 '18 at 21:43

You do have at least one other choice: using File.AppendAllLines().

var data = from item in Datas
            select string.Format("{0}, {1}-{2}", item.Property1, item.Property2, item.Property3);

File.AppendAllLines("Filename", data, new UTF8Encoding(false));

This will theoretically use less memory than your first approach since only one line at a time will be buffered in memory.

It will probably perform almost exactly the same as your second approach, though. I'm just showing you a third alternative. The only advantage of this one is that you can feed it a LINQ sequence, which can be useful sometimes.
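
For instance, here's a minimal sketch of feeding it a lazily generated sequence. GenerateLines is a hypothetical iterator standing in for the LINQ query above (it needs System.Collections.Generic and System.IO):

// Hypothetical iterator: each line is produced only when File.AppendAllLines asks for it,
// so the full set of lines is never held in memory at once.
static IEnumerable<string> GenerateLines(IEnumerable<Data> items)
{
    foreach (Data item in items)
    {
        yield return string.Format("{0}, {1}-{2}", item.Property1, item.Property2, item.Property3);
    }
}

// Streams the sequence straight to disk, appending to the file.
File.AppendAllLines("Filename", GenerateLines(Datas), new UTF8Encoding(false));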

The I/O speed will dwarf any other considerations, so you should concentrate on minimising memory usage, as juharr noted above (while also keeping in mind the dangers of premature optimisation, of course!).

That means using your second approach, or the one I put here.

Matthew Watson