For fastest I/O, you usually want to read/write in multiples of the block size of your filesystem/OS. You can query the block size by calling statfs or fstatfs on your file or file descriptor (read the man pages). The struct statfs has a field f_bsize and sometimes also f_iosize, documented as the "optimal transfer block size".

The f_bsize field exists on all POSIX systems, AFAIK. On Mac OS X and iOS, there's also f_iosize, which is the one you'd prefer on these platforms (but f_bsize works on Mac OS X/iOS as well and should usually be the same as f_iosize, IIRC).
#include <stdio.h>      // FILE, fileno()
#include <sys/param.h>  // On Mac OS X/iOS/BSD, fstatfs() and struct statfs come from
#include <sys/mount.h>  //   these two headers; on Linux, include <sys/vfs.h> instead.

struct statfs fsInfo = {0};
int fd = fileno(fp); // Get file descriptor from FILE*.
long optimalSize;
if (fstatfs(fd, &fsInfo) == -1) {
    // Querying failed! Fall back to a sane value, for example 8kB or 4MB.
    optimalSize = 4 * 1024 * 1024;
} else {
    optimalSize = fsInfo.f_bsize; // Or f_iosize on Mac OS X/iOS.
}
Now allocate a buffer of that size and read blocks of that size (using read or fread). Then iterate over this in-memory block and count the number of newlines. Repeat until EOF.
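For illustration, here's a minimal sketch of such a read loop. The helper name countNewlines, the use of fread, and the malloc'd buffer are just one way to do it; fp and optimalSize are the variables from above.

#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: count newlines by reading the file in blocks of optimalSize.
static long countNewlines(FILE *fp, long optimalSize)
{
    char *buffer = malloc((size_t)optimalSize);
    if (buffer == NULL) {
        return -1; // Allocation failed.
    }

    long newlines = 0;
    size_t bytesRead;
    while ((bytesRead = fread(buffer, 1, (size_t)optimalSize, fp)) > 0) {
        // Iterate the in-memory block and count '\n'.
        for (size_t i = 0; i < bytesRead; i++) {
            if (buffer[i] == '\n') {
                newlines++;
            }
        }
    }

    free(buffer);
    return newlines; // fread() returns 0 at EOF (or on error), which ends the loop.
}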
A different approach is the one @Ioan proposed: use mmap to map the file into memory and iterate over that buffer. This probably gives you optimal performance, as the kernel can read the data in the most efficient way, but it might fail for files that are too large to fit into the process's address space (e.g. on 32-bit systems), whereas the approach I've described above always works with files of arbitrary size and gives you near-optimal performance.
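For comparison, a rough sketch of that mmap-based variant might look like this. The helper name countNewlinesMapped is mine, and error handling is kept minimal.

#include <sys/mman.h>
#include <sys/stat.h>

// Hypothetical helper: map the whole file and count newlines in the mapping.
static long countNewlinesMapped(int fd)
{
    struct stat st;
    if (fstat(fd, &st) == -1) {
        return -1;
    }
    if (st.st_size == 0) {
        return 0; // Empty file, nothing to map.
    }

    // Map the file read-only; this can fail if the file doesn't fit into the address space.
    char *data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) {
        return -1;
    }

    long newlines = 0;
    for (off_t i = 0; i < st.st_size; i++) {
        if (data[i] == '\n') {
            newlines++;
        }
    }

    munmap(data, (size_t)st.st_size);
    return newlines;
}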