60

I am writing a small I/O library to assist with a larger (hobby) project. Part of this library performs various functions on a file, which is read and written via the FileStream object. On each StreamReader.Read(...) pass, I fire off an event which will be used in the main app to display progress information. The processing that goes on in the loop is varied, but is not too time-consuming (it could just be a simple file copy, for example, or may involve encryption...).

My main question is: What is the best memory buffer size to use? Thinking about physical disk layouts, I could pick 2 KB, which would cover a CD sector and is a nice multiple of a 512-byte hard disk sector. Higher up the abstraction tree, you could go for a larger buffer which could read an entire FAT cluster at a time. I realise that with today's PCs I could go for a more memory-hungry option (a couple of MiB, for example), but then I increase the time between UI updates and the user perceives a less responsive application.

As an aside, I'm eventually hoping to provide a similar interface to files hosted on FTP / HTTP servers (over a local network / fastish DSL). What would be the best memory buffer size for those (again, a "best-case" tradeoff between perceived responsiveness vs. performance)?

Peter Mortensen
AJ.
  • It may be helpful: http://stackoverflow.com/questions/19558435/what-is-the-best-buffer-size-when-using-binaryreader-to-read-big-files-1gb/19837238?noredirect=1#19837238 – Amir Pournasserian Nov 07 '13 at 13:33
  • I'd have thought that the OS or Windows would maintain its own profile of hardware capabilities and speeds and provide a service that recommends the best buffer-size for a given storage volume and activity (e.g. random read/writes vs sequential read/write) - that would take out the guesswork. – Dai Jan 14 '19 at 23:40
  • Possible duplicate of [C# FileStream : Optimal buffer size for writing large files?](https://stackoverflow.com/questions/1862982/c-sharp-filestream-optimal-buffer-size-for-writing-large-files) – Sinto Mar 28 '19 at 08:44

4 Answers

89

Files are already buffered by the file system cache. You just need to pick a buffer size that doesn't force FileStream to make the native Windows ReadFile() API call to fill the buffer too often. Don't go below a kilobyte; more than 16 KB is a waste of memory and unfriendly to the CPU's L1 cache (typically 16 or 32 KB of data).

4 KB is a traditional choice, even though that will exactly span a virtual memory page only ever by accident. It is difficult to profile; you'll end up measuring how long it takes to read a cached file, which runs at RAM speeds, 5 gigabytes/sec and up, if the data is available in the cache. It will be in the cache the second time you run your test, and that won't happen too often in a production environment. File I/O is completely dominated by the disk drive or the NIC and is glacially slow; copying the data is peanuts. 4 KB will work fine.
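As a rough illustration of the advice above, here is a minimal sketch of a copy loop that uses a 4 KB buffer and reports progress after each chunk. The class name `BufferedCopy`, its methods, and the `onProgress` callback are hypothetical names made up for this example; only `FileStream`, `Stream.Read`, and `Stream.Write` are real .NET APIs.

```csharp
using System;
using System.IO;

public static class BufferedCopy
{
    // 4 KB, the size suggested above. Treat it as a starting point, not a proven optimum.
    public const int BufferSize = 4096;

    // Copies source to destination in BufferSize chunks and reports the
    // running total of bytes copied after each chunk (hypothetical helper).
    public static long Copy(Stream source, Stream destination, Action<long> onProgress = null)
    {
        var buffer = new byte[BufferSize];
        long total = 0;
        int read;

        // Read returns 0 at end of stream; it may return fewer bytes than requested.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
            onProgress?.Invoke(total);
        }

        return total;
    }

    // Opens both files with the same explicit FileStream buffer size and copies between them.
    public static long CopyFile(string sourcePath, string destinationPath, Action<long> onProgress = null)
    {
        using (var input = new FileStream(sourcePath, FileMode.Open, FileAccess.Read, FileShare.Read, BufferSize))
        using (var output = new FileStream(destinationPath, FileMode.Create, FileAccess.Write, FileShare.None, BufferSize))
        {
            return Copy(input, output, onProgress);
        }
    }
}
```

With a buffer this small the callback fires very often, so a UI would probably want to ignore most of the calls and redraw only every so many bytes or milliseconds.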

Peter Mortensen
Hans Passant
  • Low buffer sizes like 4-8 KB are also preferable because the CPU cache can hold such amounts. If you go too small you can accumulate significant overhead from kernel transitions though. – usr Feb 19 '12 at 18:47
  • @HansPassant: My application deals with lots of small files together as well as large ones separately. Will a 4KB size adversely affect performance for files smaller than 4KB? – Raheel Khan Sep 27 '12 at 02:36
  • 4KB is the default value used by .net framework: http://msdn.microsoft.com/en-us/library/dd783870.aspx – giammin Oct 31 '12 at 17:25
  • If the documentation is correct, in 4.5 they increased the default value to 81920. – Justin Helgerson Jun 04 '14 at 21:32
  • The documentation is correct, .NET Reflector shows the `_DefaultCopyBufferSize` has a value of `0x14000` (81920, or 80K). However, this is for copying from stream to stream, not buffering data. The [BufferedStream Class](http://msdn.microsoft.com/en-us/library/system.io.bufferedstream(v=vs.110).aspx) has a `_DefaultBufferSize` of `0x1000` (4096 or 4k), this would be a better class to look at for understanding what buffer size the .NET framework uses for streams. – Owain Williams Jan 07 '15 at 10:58
  • I understand when using `FileStream` with `useAsync: true`, for best async performance the buffer size should be at least 1 megabyte in order for the cost of the async overhead to be worthwhile while waiting for overlapped disk IO to complete - I don't remember where I got this detail from, but would you agree with it? – Dai Mar 05 '20 at 02:34
4

When I deal with files directly through a stream object, I typically use 4096 bytes. It seems to be reasonably effective across multiple I/O areas (local file system, LAN/SMB, network stream, etc.), but I haven't profiled it or anything. Way back when, I saw several examples use that size, and it stuck in my memory. That doesn't mean it's the best though.

Peter Mortensen
Nate
  • Right. I wouldn't ever use anything less than 4k, since it's the smallest block managed by the virtual memory system (on which the disk cache is based). – Ben Voigt Jun 13 '10 at 20:37
3

"It depends".

You would have to test your application with different buffer sizes to determine which is best. You can't guess ahead of time.
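One way such a test might look is sketched below: it times a full sequential read of the same file with several candidate buffer sizes. `BufferSizeTimer`, `TimeRead`, and `Compare` are hypothetical names for this example; `Stopwatch` and `FileStream` are the real .NET classes. Bear in mind that after the first pass the file is probably served from the file system cache, so the numbers may reflect memory speed rather than disk speed.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

public static class BufferSizeTimer
{
    // Reads the whole file once with the given buffer size and returns the elapsed time.
    public static TimeSpan TimeRead(string path, int bufferSize)
    {
        var buffer = new byte[bufferSize];
        var watch = Stopwatch.StartNew();

        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize))
        {
            // Discard the data; only the time spent reading is of interest here.
            while (stream.Read(buffer, 0, buffer.Length) > 0)
            {
            }
        }

        watch.Stop();
        return watch.Elapsed;
    }

    // Runs TimeRead once per candidate size and returns the timings keyed by buffer size.
    public static Dictionary<int, TimeSpan> Compare(string path, IEnumerable<int> bufferSizes)
    {
        var results = new Dictionary<int, TimeSpan>();
        foreach (int size in bufferSizes)
        {
            results[size] = TimeRead(path, size);
        }
        return results;
    }
}
```

A call such as `BufferSizeTimer.Compare(path, new[] { 1024, 4096, 16384, 81920 })` gives one timing per size. Repeating the run several times on files that resemble the real workload should make the comparison more trustworthy than a single pass.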

John Saunders
0

I suppose that the default value is usually the best, so I use 4096 bytes, based on the internal const int DefaultBufferSize in the FileStream class.

honzakuzel1989
  • Default is not always the best. It's just a good compromise for the more common cases, not the optimal for all loads. – Hejazzman Feb 11 '19 at 09:31