I've pondered basically the same question, and I've come to a very simple conclusion:
Use a conservative default or heuristic, but let the user override it easily if they want.
You see, in some cases the user might not want maximum throughput from your utility, but would rather have it do whatever it does in the background. Perhaps the task is just not that important. Personally, on Linux, I often use the nice and ionice utilities to put long-running but low-priority tasks on the back burner, so to speak, so that they don't interfere with my actual work.
Benchmarks from the last decade indicate that block sizes of 128k to 2M (2^17 to 2^21 bytes) consistently work well -- not far from the optimal rates in almost all situations -- with the average slowly shifting towards the larger end of that range. Power-of-two sizes typically seem to work better than non-powers-of-two, although I haven't seen enough benchmarks of various RAID configurations to trust that fully.
Because your utility will almost certainly be recompiled for each new hardware type/generation, I'd prefer a default block size defined at compile time, but trivially overridable at run time (via a command-line option, an environment variable, and/or a configuration file).
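Something along these lines, in C; the macro name, the environment variable, and the -b option here are just placeholders I made up, not any established convention:

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Compile-time default; a packager can override it at build time,
   e.g.  cc -DDEFAULT_BLOCK_SIZE=65536 ...  */
#ifndef DEFAULT_BLOCK_SIZE
#define DEFAULT_BLOCK_SIZE (256 * 1024)
#endif

/* Pick the block size: a command-line option wins, then an
   environment variable, then the compiled-in default. */
static size_t choose_block_size(int argc, char *argv[])
{
    size_t size = DEFAULT_BLOCK_SIZE;

    const char *env = getenv("MYUTIL_BLOCK_SIZE");  /* hypothetical name */
    if (env && *env)
        size = (size_t)strtoull(env, NULL, 0);

    int opt;
    while ((opt = getopt(argc, argv, "b:")) != -1) {
        if (opt == 'b')
            size = (size_t)strtoull(optarg, NULL, 0);
    }

    return size;
}

int main(int argc, char *argv[])
{
    printf("Using a %zu-byte block size\n", choose_block_size(argc, argv));
    return 0;
}
```

The exact values matter less than the precedence: command line beats environment, which beats the compiled-in default, so nobody is ever stuck with a poor choice.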
If your utility is packaged for current POSIXy OSes, the binaries could use a default that best suits the typical tasks done on that class of machine; for example, Raspberry Pis and other SBCs often don't have that much memory to start with, so a smaller default block size (say, 65536 bytes) might work best. Desktop users might not care about memory hogs, so you might use a much larger default block size on current desktop machines.
(On servers, and in high-performance computing (which is where I've pondered this), the block size is basically either benchmarked on the exact hardware and workload, or it is just a barely informed guess. Typically the latter.)
Alternatively, you could construct a heuristic based on the st_blksize values of the files involved, perhaps multiplied by a default factor and clamped to some preferred range, as sketched below. However, such heuristics tend to bit-rot fast as hardware changes.
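A rough sketch of what such a heuristic could look like; the multiplier and the clamping limits are my own guesses, loosely matching the benchmark range mentioned above:

```c
#include <sys/stat.h>
#include <stddef.h>

/* Suggest a block size from a file's st_blksize: scale it up by a
   factor, clamp it to a preferred range, and round down to a power
   of two. The factor and limits are guesses that will need revisiting
   as hardware changes. */
static size_t suggest_block_size(int fd)
{
    const size_t minimum = 128 * 1024;      /* 2^17 bytes */
    const size_t maximum = 2 * 1024 * 1024; /* 2^21 bytes */
    const size_t factor  = 16;              /* arbitrary multiplier */

    struct stat st;
    size_t size = minimum;

    if (fstat(fd, &st) == 0 && st.st_blksize > 0)
        size = (size_t)st.st_blksize * factor;

    if (size < minimum) size = minimum;
    if (size > maximum) size = maximum;

    /* Round down to a power of two within the range. */
    size_t p = minimum;
    while (p * 2 <= size)
        p *= 2;

    return p;
}
```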
With heuristics, it is important to remember that the idea is not to always achieve the optimum, but to avoid really poor results. If a user wants to squeeze out the last few percent of performance, they can do some benchmarking within their own workflow, and tune the defaults accordingly. (I personally have, and do.)