
After this question, a doubt comes up:

Since fopen() is faster than open() (at least for sequential write/read operations), and since the former is a library function while the latter is a system call, what system call does fopen() make in a POSIX-compliant OS?

Daniel Bandeira
  • It depends on the implementation, but usually `fopen` ends up calling `open`. – Jabberwocky Oct 14 '21 at 13:02
  • I peeked at glibc and it would seem that it _maybe_ uses `mmap` depending on use-case. – Lundin Oct 14 '21 at 13:05
  • @Lundin But to call `mmap` it has to call `open` first (except maybe for `open_memstream`, but that's a totally different scenario) – zwol Oct 14 '21 at 13:08
  • @zwol I only took a brief glance at the code but there were different use-cases for read-only etc. – Lundin Oct 14 '21 at 13:12
  • @Lundin Yeah, all I'm saying is that _whatever_ it conditionally uses `mmap` for, that must be as an alternative to `read`/`write`, not `open`. – zwol Oct 14 '21 at 13:44
  • `fopen()` is certainly not faster than `open()` (probably much slower as it has to set up buffering) but it may result in faster I/O later on. – mfro Oct 14 '21 at 14:46

1 Answer


It calls open.
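If you want to check this for yourself, one quick way (sketched here assuming Linux with glibc; the path is only a placeholder) is to trace a trivial program whose only I/O is an fopen/fclose pair:

#include <stdio.h>

int main(void)
{
    /* Open and immediately close an arbitrary existing file;
       the path is just an example. */
    FILE *f = fopen("/etc/hostname", "r");
    if (f) fclose(f);
    return 0;
}

Run it under strace (or your OS's equivalent tracer) and you should see an open or openat call for that path, issued from inside fopen.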

fopen itself is not faster than open; it can't be, it's open plus some extra work. The performance benefit, described in the linked question, is from the "buffering" done by the FILE object over its entire lifecycle. The actual optimization is to call the write primitive fewer times, providing more data each time.
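To make that concrete, here is a rough sketch (plain POSIX open/write, error handling omitted, function names are mine) of the difference between issuing the write primitive once per byte and once per buffer-full; the second pattern is roughly what a buffered FILE does for you behind the scenes:

#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* One system call per byte: what an unbuffered stream amounts to. */
void write_per_byte(int fd, const char *data, size_t n)
{
    for (size_t i = 0; i < n; i++)
        write(fd, &data[i], 1);
}

/* Accumulate into a local buffer, then one system call per full
   buffer: roughly what a fully buffered stream does internally. */
void write_per_buffer(int fd, const char *data, size_t n)
{
    char buf[8192];
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        buf[used++] = data[i];
        if (used == sizeof buf) {
            write(fd, buf, used);
            used = 0;
        }
    }
    if (used)
        write(fd, buf, used);
}

int main(void)
{
    int fd = open("/dev/null", O_WRONLY);
    if (fd < 0) return 1;
    const char *msg = "hello, world\n";
    write_per_byte(fd, msg, strlen(msg));    /* 13 write calls */
    write_per_buffer(fd, msg, strlen(msg));  /* 1 write call */
    return 0;
}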

Here is a simple way to demonstrate the effect: Compile this program.

#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc != 3) return 1;
    long count = atol(argv[1]);
    long chunk = atol(argv[2]);
    if (count < 1 || chunk < 0) return 1;

    FILE *sink = fopen("/dev/null", "wb");
    if (!sink) return 1;

    if (chunk) {
        char *buf = malloc(chunk);
        if (!buf) return 1;
        if (setvbuf(sink, buf, _IOFBF, chunk)) return 1;
    } else {
        if (setvbuf(sink, 0, _IONBF, 0)) return 1;
    }

    while (count--) putc_unlocked('.', sink);
    return 0;
}

It takes two arguments: the total number of bytes to write, and the size of the output buffer, in that order. Run it with various values of both parameters and time its performance; you should see that, e.g.

./a.out $((1024*1024*128)) 1

is much slower than

./a.out $((1024*1024*128)) 8192

The first number will need to be quite large for the difference to be measurable. Once you've played around with that for a while, run

strace ./a.out 50 1

and

strace ./a.out 50 50

to understand the difference in what's going on at the system call level. (If you are using an OS other than Linux, replace strace with whatever the equivalent command is for that OS.)
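If you only want to see the write calls, or just count them, strace can narrow things down (assuming Linux strace here; other tracers have their own options):

strace -e trace=write ./a.out 50 1
strace -c ./a.out $((1024*1024)) 1

The first command filters the trace down to write calls only; the second prints a summary table counting how many times each system call was made, which makes the contrast between the two buffer sizes easy to see.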

zwol
  • Additionally, using the stdio functions other than `fopen()` is not *inherently* faster than using the POSIX I/O functions. One can perform one's own buffering with the latter, for example, and possibly even tune that better to the use case at hand. But the stdio functions certainly do hide a significant amount of low-level detail, and moreover, provide a well tested implementation of those details. – John Bollinger Oct 14 '21 at 13:14
  • On my system, 1 minute and 20.0762 seconds versus 0.762 seconds. About 106 times slower! – Daniel Bandeira Oct 14 '21 at 23:22