I have this simple code, which combines several text files into one file:

void Main()
{
    const int chunkSize = 2 * 1024; // 2KB
    var inputFiles = new[] { @"c:\1.txt", @"c:\2.txt", @"c:\3.txt" };
    using (var output = File.Create(@"c:\output.dat"))
    {
        foreach (var file in inputFiles)
        {
            using (var input = File.OpenRead(file))
            {
                var buffer = new byte[chunkSize];
                int bytesRead;
                while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
                {
                    output.Write(buffer, 0, bytesRead);
                }
            }
        }
    }
}

My question is about the value of chunkSize.

How can I know whether the number I've chosen (2 * 1024) is the right one?

I'm trying to find the ideal formula:

Assuming each file is F MB in size, I have R MB of RAM, and the block size of my hard disk is B KB, is there a formula I can build to find the ideal buffer size?

Royi Namir
  • There can't be any such formula, since even the "block size of the HD" is "virtualized" these days. On top of that, there are differences (sometimes big ones) between OS versions/editions. If you want maximum performance, you should check out MMF (Memory-Mapped Files), which has been available in Windows for a long time and is now even part of .NET (4.0 and up). – Yahia Aug 13 '13 at 11:08
  • What are you trying to achieve by buffer size tuning? – Sergey Vyacheslavovich Brunov Aug 13 '13 at 11:13
  • There's also buffering of the I/O by the OS, so you could do 1 byte at a time and still get workable performance (don't do this). Your best bet is to benchmark it with some different sizes (512, 1 KB, 2 KB, 4 KB, etc.) and see which one is fastest. – Ibasa Aug 13 '13 at 11:15
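
The benchmarking suggested in the last comment could be sketched roughly as follows. This is only an illustration, not a rigorous benchmark: the class name `ChunkSizeBenchmark`, the method `TimeMerge`, and the list of sizes tried are all made up for this example, and the paths are simply the ones from the question. Keep in mind that the OS file cache usually makes the first pass slower than later ones, so it is worth running the loop a few times before trusting the numbers.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class ChunkSizeBenchmark
{
    // Merges all input files into outputPath using a buffer of chunkSize bytes
    // and returns the elapsed wall-clock time. (Hypothetical helper.)
    public static TimeSpan TimeMerge(string[] inputFiles, string outputPath, int chunkSize)
    {
        var buffer = new byte[chunkSize];
        var stopwatch = Stopwatch.StartNew();
        using (var output = File.Create(outputPath))
        {
            foreach (var file in inputFiles)
            {
                using (var input = File.OpenRead(file))
                {
                    int bytesRead;
                    while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        output.Write(buffer, 0, bytesRead);
                    }
                }
            }
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    static void Main()
    {
        // Paths from the question; replace with files that exist on your machine.
        var inputFiles = new[] { @"c:\1.txt", @"c:\2.txt", @"c:\3.txt" };

        // Candidate sizes are an assumption; extend or change the list as needed.
        foreach (var chunkSize in new[] { 512, 1024, 2048, 4096, 8192, 65536 })
        {
            var elapsed = TimeMerge(inputFiles, @"c:\output.dat", chunkSize);
            Console.WriteLine($"{chunkSize,6} bytes: {elapsed.TotalMilliseconds:F1} ms");
        }
    }
}
```

The fastest size on one machine, disk, or OS version will not necessarily be the fastest on another, which is the point made in the comments above.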

1 Answer

4 KB is a good choice. For more information, see:
File I/O with streams - best memory buffer size
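
As a rough sketch of what that could look like for the code in the question, the manual read/write loop can be replaced with `Stream.CopyTo`, which accepts the buffer size as its second argument. The class name `FileMerger` and the method `Merge` are hypothetical names used only for this example, and the paths are the ones from the question.

```csharp
using System.IO;

static class FileMerger
{
    const int BufferSize = 4 * 1024; // 4 KB, the size suggested in this answer

    // Appends every input file, in order, to a newly created output file.
    public static void Merge(string[] inputFiles, string outputPath)
    {
        using (var output = File.Create(outputPath))
        {
            foreach (var file in inputFiles)
            {
                using (var input = File.OpenRead(file))
                {
                    // CopyTo reads and writes in chunks of BufferSize bytes.
                    input.CopyTo(output, BufferSize);
                }
            }
        }
    }

    static void Main()
    {
        // Paths from the question; replace with files that exist on your machine.
        Merge(new[] { @"c:\1.txt", @"c:\2.txt", @"c:\3.txt" }, @"c:\output.dat");
    }
}
```

Keeping the size in a single constant makes it easy to try other values later.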

Greetings

Bassam Alugili
  • I note that when working with asynchronous FileStreams (`new FileStream( ..., useAsync: true );`) you'll want a larger buffer; other posts on StackOverflow suggest 80 KiB to 128 KiB for best performance. Also ensure that the buffer in any associated `StreamReader`/`StreamWriter`/`BinaryReader`/`BinaryWriter` is appropriately sized. Async IO performs worse than synchronous IO with small buffers, but performs much better than synchronous IO with large buffers that compensate for the overhead of the async plumbing (especially under heavy load, where async IO frees up those threads!). – Dai Jun 25 '20 at 14:55
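
An asynchronous version of the merge along the lines of that comment might look like the sketch below. It assumes 80 KiB (the low end of the range mentioned) and uses made-up names (`AsyncFileMerger`, `MergeAsync`); whether it actually beats the synchronous loop depends on the workload, so it is worth measuring rather than assuming.

```csharp
using System.IO;
using System.Threading.Tasks;

static class AsyncFileMerger
{
    const int BufferSize = 80 * 1024; // 80 KiB; an assumption taken from the comment above

    // Appends every input file, in order, to a newly created output file
    // using FileStreams opened for asynchronous I/O.
    public static async Task MergeAsync(string[] inputFiles, string outputPath)
    {
        using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write,
                                           FileShare.None, BufferSize, useAsync: true))
        {
            foreach (var file in inputFiles)
            {
                using (var input = new FileStream(file, FileMode.Open, FileAccess.Read,
                                                  FileShare.Read, BufferSize, useAsync: true))
                {
                    await input.CopyToAsync(output, BufferSize);
                }
            }
        }
    }

    // An async Main requires C# 7.1 or later.
    static async Task Main()
    {
        // Paths from the question; replace with files that exist on your machine.
        await MergeAsync(new[] { @"c:\1.txt", @"c:\2.txt", @"c:\3.txt" }, @"c:\output.dat");
    }
}
```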