
I recently needed to read a non-trivially sized file line by line, and to push performance I decided to follow some advice I'd been given which states that fstreams are much slower than C-style I/O. However, despite my best efforts, I have not been able to reproduce the same dramatic differences (~25%, which is large but not insane). I also tried fscanf and found it to be slower by an order of magnitude.

My question is: what is causing the performance difference under the covers, and why is fscanf so abysmal?

The following is my code (compiled with TDM GCC 5.1.0):

#include <cstdio>
#include <fstream>
#include <string>
#include <chrono>
#include <iostream>
using namespace std;
using namespace std::chrono;

struct file
{
    file(const char* str, const char* mode)
        : fp(fopen(str, mode)){}
    ~file(){fclose(fp);}
    FILE* fp;
};

constexpr size_t bufsize = 256;
auto readWord(int pos, char*& word, char* const buf)
{
    for(; pos != bufsize && buf[pos] != '\n'; ++word, ++pos) // check bounds before indexing buf
        *word = buf[pos];
    if(pos == bufsize)
        return 0;
    *word = '\0';
    return pos + 1;
}

void readFileC()
{
    file in{"inC.txt", "r"};
    char buf[bufsize];
    char word[40];

    char* pw = word;
    int sz = fread(buf, 1, bufsize, in.fp);
    for(; sz == bufsize; sz = fread(buf, 1, bufsize, in.fp))
    {
        for(auto nextPos = readWord(0, pw, buf); (nextPos = readWord(nextPos, pw, buf));)
        {
            //use word here
            pw = word;
        }
    }

    for(auto nextPos = readWord(0, pw, buf); nextPos < sz; nextPos = readWord(nextPos, pw, buf))
    {
        //use word here
        pw = word;
    }
}

void readFileCline()
{
    file in{"inCline.txt", "r"};
    char word[40];
    while(fscanf(in.fp, "%s", word) != EOF);
        //use word here
}

void readFileCpp()
{
    ifstream in{"inCpp.txt"};
    string word;
    while(getline(in, word));
        //use word here
}

int main()
{
    static constexpr int runs = 1;

    auto countC = 0;
    for(int i = 0; i < runs; ++i)
    {
        auto start = steady_clock::now();
        readFileC();
        auto dur = steady_clock::now() - start;
        countC += duration_cast<milliseconds>(dur).count();
    }
    cout << "countC: " << countC << endl;

    auto countCline = 0;
    for(int i = 0; i < runs; ++i)
    {
        auto start = steady_clock::now();
        readFileCline();
        auto dur = steady_clock::now() - start;
        countCline += duration_cast<milliseconds>(dur).count();
    }
    cout << "countCline: " << countCline << endl;

    auto countCpp = 0;
    for(int i = 0; i < runs; ++i)
    {
        auto start = steady_clock::now();
        readFileCpp();
        auto dur = steady_clock::now() - start;
        countCpp += duration_cast<milliseconds>(dur).count();
    }

    cout << "countCpp: " << countCpp << endl;
}

Run with a file of size 1070KB, these are the results:

countC: 7
countCline: 61
countCpp: 9

EDIT: the three test cases now read different files and each runs only once. The results are exactly 1/20 of reading the same file 20 times. countC consistently outperforms countCpp, even when I flip the order in which they run.

Passer By

  • Reading the **same** file twice and comparing reading speed is dubious at best. Disk+OS-level caching is making your results close to random. – SergeyA May 10 '16 at 19:44
  • What is the contents of the file? Is it space-separated words or newline-separated words? – kfsone May 10 '16 at 19:44
  • The accepted answer (to the question you link to) is wrong, and the comments underneath it explain why. Its author was performing flushing after _every single line_ in the C++ versions, but not the C versions. In fact, there are a lot of utter crap answers over there. **Read the peer review comments instead of taking everything you see at face value.** – Lightness Races in Orbit May 10 '16 at 19:45
  • Comparing `fread` and `fscanf` is also strange. `fscanf` is formatted input, `fread` just reads a buffer. Nobody concerned with performance should use formatted IO. – SergeyA May 10 '16 at 19:45
  • You should clearly separate two elements of the puzzle - reading the buffer and parsing the buffer. In my experience, reading the buffer is approximately the same speed for both `fstream` (using `read`) and `fread`. – SergeyA May 10 '16 at 19:48 (a sketch of this split follows these comments)
  • @LightnessRacesinOrbit, should we actually do something about this bad answer? Edit it? I went and downvoted it, but I doubt it's enough. – SergeyA May 10 '16 at 19:49
  • `fread(buf, 1, bufsize, in.fp);` only needs to read. `fscanf(in.fp, "%s", word)` needs to 1) scan and skip leading white-space, 2) scan and save non-white-space until reading a white-space, and then 3) put that white-space back for the next input call. On the plus side, it does not need to check for a width limit. So AFAIK, the code has UB if the input exceeds the buffer. So @OP, what is the evidence that the code does not overfill `char word[40]`? – chux - Reinstate Monica May 10 '16 at 19:51
  • @LightnessRacesinOrbit I have seen the comments on `endl`, but I was referring to the part where he edited the answer and said that even without the `endl` there would still be a 3x speed difference. – Passer By May 11 '16 at 04:37
  • @chux the 40-byte buffer is an arbitrary limit I put there, simply because in my case it is enough. Am I wrong to believe that anything under a cache line will not make a performance difference? – Passer By May 11 '16 at 04:40
  • @chux I also tried changing `char word[40]` to `char word[65536]` and it doesn't actually produce a noticeable difference – Passer By May 11 '16 at 04:52
  • It's not clear that your readFileC and readFileCpp do the same thing. As posted they look very different. – n. m. could be an AI May 11 '16 at 05:04
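
A minimal sketch of the split SergeyA describes above, timing only the raw buffer reads with no parsing at all. It is illustrative only: the file names are taken from the question, the 64KB chunk size is an arbitrary assumption, and error handling is kept to a minimum.

#include <chrono>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iostream>

int main()
{
    using namespace std::chrono;
    constexpr std::size_t chunk = 1 << 16; // arbitrary chunk size
    static char buf[chunk];

    // Raw reads through the C++ stream, no parsing.
    auto t0 = steady_clock::now();
    std::ifstream in{"inCpp.txt", std::ios::binary};
    while(in.read(buf, chunk))
        ;
    auto cppMs = duration_cast<milliseconds>(steady_clock::now() - t0).count();

    // Raw reads through the C API, no parsing.
    auto t1 = steady_clock::now();
    std::FILE* fp = std::fopen("inC.txt", "rb");
    if(!fp)
        return 1;
    while(std::fread(buf, 1, chunk, fp) == chunk)
        ;
    std::fclose(fp);
    auto cMs = duration_cast<milliseconds>(steady_clock::now() - t1).count();

    std::cout << "ifstream::read: " << cppMs << " ms, fread: " << cMs << " ms\n";
}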

1 Answer


`fscanf` has to parse the format string parameter, looking for all possible `%` signs and interpreting them, along with width specifiers, escape characters, and so on. It has to walk the format parameter more or less one character at a time, working through a very big set of potential formats. Even if your format is as simple as `"%s"`, there is still a lot of overhead involved relative to the other techniques, which simply grab a bunch of bytes with almost no overhead of interpretation or conversion.
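
A rough sketch of that difference (illustrative only: the function names and the simplified tokenizer are assumptions, not the question's `readWord`, and `"inCline.txt"` is the question's file name). The `fscanf` version makes the library interpret a format string and classify every character against it, while the `fread` version's only per-character work is an `isspace()` test and a byte copy.

#include <cctype>
#include <cstddef>
#include <cstdio>

// Formatted input: the library has to interpret "%39s", skip leading
// white-space, then classify characters one by one until the next white-space.
void readTokensScanf(std::FILE* fp)
{
    char word[40];
    while(std::fscanf(fp, "%39s", word) == 1) // width limit keeps word from overflowing
    {
        // use word here
    }
}

// Unformatted input: grab raw chunks with fread and split them ourselves;
// the only per-character work is a white-space test and a byte copy.
void readTokensManual(std::FILE* fp)
{
    char buf[4096];
    char word[40];
    std::size_t len = 0;
    std::size_t got;
    while((got = std::fread(buf, 1, sizeof buf, fp)) > 0)
    {
        for(std::size_t i = 0; i < got; ++i)
        {
            if(std::isspace(static_cast<unsigned char>(buf[i])))
            {
                if(len > 0) // a token just ended
                {
                    word[len] = '\0';
                    // use word here
                    len = 0;
                }
            }
            else if(len + 1 < sizeof word) // silently truncate overlong tokens
            {
                word[len++] = buf[i];
            }
        }
    }
    if(len > 0) // final token if the file does not end in white-space
    {
        word[len] = '\0';
        // use word here
    }
}

int main()
{
    if(std::FILE* fp = std::fopen("inCline.txt", "r"))
    {
        readTokensScanf(fp); // or readTokensManual(fp)
        std::fclose(fp);
    }
}

The `%39s` width limit also sidesteps the overflow of `char word[40]` that chux points out in the comments under the question.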

abelenky

  • Note that `fscanf` _may_ not have to parse the format string parameter at run time, each function call, as smart compilers can create fast code from simple constant formats. – chux - Reinstate Monica May 10 '16 at 19:48
  • I haven't heard of that type of optimization, but even if it can optimize the format parameter, it still needs to parse the input bit by bit at runtime to make it fit the `%s` format. – abelenky May 10 '16 at 20:04
  • [Agree](http://stackoverflow.com/questions/37147524/comparison-of-c-and-c-file-read-performance/37147596?noredirect=1#comment61833391_37147524) about the parsing of input: one character at a time. – chux - Reinstate Monica May 10 '16 at 20:13
  • This certainly makes sense. I believe this would be a full answer if you could also add why `fstream` is slower than a C-style file read. – Passer By May 11 '16 at 04:54