I wrote a program that reads the characters of a file many times, in a loop. If I don't care about memory usage, is storing all the characters of the file in an array faster than accessing them with fgetc?
-
Disk read/write access is slow compared to memory access! – Steephen Jan 01 '16 at 17:12
-
You should include in the question the exact pieces of code that you want to compare. – Ilya Jan 01 '16 at 17:14
-
I know, but maybe there is a buffer for fgetc – Spooky Jan 01 '16 at 17:14
-
Even so, system calls are slow next to local code. Have you not even tested the speed for yourself? – Weather Vane Jan 01 '16 at 17:53
-
I'm not going to do it. I just realized that for every few calls to fgetc (to get a word), the program goes through an array of 300,000+ strings linearly. So fgetc is not the big problem. – Spooky Jan 01 '16 at 18:07
2 Answers
In general, it's impossible to answer performance questions without knowing the details of the platform and the exact code you want to compare. However, in this case, buffering the file contents in an array is likely to be much faster on most platforms.
For one, disk is orders of magnitude slower than main memory.
And even if your OS (or libc) caches the data in RAM, `fgetc` still pays a function-call cost for every character and, if the stream is not buffered, a system call as well, which is likely much slower than a simple memory read.
Also, because of the relative slowness of system calls, use `fread` instead of `fgetc` to read a block of bytes in a single call.
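As a rough illustration of reading the file into an array with `fread`, here is a minimal sketch. The helper name `read_whole_file` and the initial capacity of 4096 bytes are assumptions made for this example, not something from the question; the buffer grows by doubling so the file size does not need to be known in advance.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (name chosen for this sketch): reads the whole file
   at `path` into a malloc'd buffer. Returns NULL on failure; on success
   stores the number of bytes read in *out_len. The caller frees the buffer. */
char *read_whole_file(const char *path, size_t *out_len)
{
    assert(path != NULL && out_len != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    size_t cap = 4096;          /* assumed starting capacity */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    for (;;) {
        if (len == cap) {       /* buffer full: double it */
            size_t new_cap = cap * 2;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                fclose(fp);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        /* One fread call fetches a whole block instead of one character. */
        size_t n = fread(buf + len, 1, cap - len, fp);
        len += n;
        if (n == 0)             /* end of file or read error */
            break;
    }

    if (ferror(fp)) {
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);
    *out_len = len;
    return buf;
}
```

After one call to this helper, every later pass over the data is a plain loop over `buf[0]` to `buf[len - 1]`, with no stdio calls involved.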

-
`fread` reads bytes, which are pretty much equivalent to `char`s ([citation](http://stackoverflow.com/questions/2215445/are-there-machines-where-sizeofchar-1-or-at-least-char-bit-8)). – Thomas Jan 01 '16 at 17:20
-
`fgetc` still does buffering, there aren't any more system calls involved than if you'd do an `fread`. – fuz Jan 01 '16 at 17:21
-
@FUZxxl Not necessarily -- I can't find this requirement in the standard. Even if it buffers, anecdotally [`fgetc` still turns out to be slower](http://stackoverflow.com/questions/13225014/why-fgetc-too-slow). – Thomas Jan 01 '16 at 17:24
-
@Spooky No, use `fread` to read the data straight into your array. You may need a cast. – Thomas Jan 01 '16 at 17:25
-
@Thomas That's why you should use `getc` if possible, but still, glibc is braindead and doesn't implement that as a macro. – fuz Jan 01 '16 at 17:41
-
@Thomas See ISO 9899:2011 §7.21.3 “(...) When a stream is *fully buffered*, characters are intended to be transmitted to or from the host environment as a block when a buffer is filled. (...)” This applies to all stdio functions, including `fgetc` and `fread`. On all but the most exotic platforms, streams are buffered by default although the standard leaves that open. – fuz Jan 01 '16 at 18:44
-
@Thomas Yes, one `fread` is faster than a series of `fgetc` calls, but `getc` calls are often still as fast as manually keeping track of another layer of buffering and often make for easier code. When speed is needed, consider `getc_unlocked()`. – fuz Jan 01 '16 at 18:44
I think you should at least use some form of buffering, rather than reading one character at a time to fill the buffer or array.
Better to use `fread()` to fill a buffer/array, or you might even look into memory mapping (`mmap`) to avoid copying data from the disk cache in kernel mode to a buffer in user mode, if you want slightly more performance (since your question is also tagged performance). For a single read pass, though, your hard disk will certainly be the bottleneck.
If you only need to read the data once, `fread()` with buffer(s) might be the way to go.
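For the memory-mapping route, a minimal sketch might look like the following. It assumes a POSIX system; the helper names `map_file_readonly` and `count_byte` are made up for this example, and treating an empty file as a failure is a simplification (`mmap` rejects a zero length).

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (POSIX only): maps the file at `path` read-only.
   Returns NULL on failure or for an empty file; on success stores the
   size in *out_len. Release the mapping later with munmap(). */
const char *map_file_readonly(const char *path, size_t *out_len)
{
    assert(path != NULL && out_len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *out_len = (size_t)st.st_size;
    return p;
}

/* Example of a pass over the mapped data: counts occurrences of `c`. */
size_t count_byte(const char *data, size_t len, char c)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++)
        if (data[i] == c)
            count++;
    return count;
}
```

The mapped region can be scanned as many times as needed like an ordinary array; whether it actually beats `fread()` into a buffer depends on the platform and access pattern, so it is worth measuring.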
