
Recently I decided to optimize some file reading I was doing, because, as everyone says, reading a large chunk of data into a buffer and then working with it is faster than doing lots of small reads. And my code certainly is much faster now, but after doing some profiling it appears memcpy is taking up a lot of time.

The gist of my code is...

ifstream file("some huge file");
char buffer[0x1000000];
for (yada yada) {
    int size = some arbitrary size usually around a megabyte;
    file.read(buffer, size);
    //Do stuff with buffer
}

I'm using Visual Studio 11, and profiling shows that ifstream::read() eventually calls xsgetn(), which copies from the stream's internal buffer to my buffer. This operation takes up over 80% of the time! In second place comes uflow(), at 10%.

Is there any way I can get around this copying? Can I somehow tell the ifstream to buffer the size I need directly into my buffer? Does the C-style FILE* also use such an internal buffer?

UPDATE: Due to people telling me to use cstdio... I have done a benchmark.

EDIT: Unfortunately the old code was full of fail (it wasn't even reading the entire file!). You can see it here: http://pastebin.com/4dGEQ6S7

Here's my new benchmark:

#include <cstdio>
#include <ctime>
#include <fstream>
#include <iostream>
#include <string>
#include <windows.h>
using namespace std;

const int MAX = 0x10000;
char buf[MAX];
string fpath = "largefile";
int main() {
    {
        clock_t start = clock();
        ifstream file(fpath, ios::binary);
        while (!file.eof()) {
            file.read(buf, MAX);
        }
        clock_t end = clock();
        cout << end-start << endl;
    }
    {
        clock_t start = clock();
        FILE* file = fopen(fpath.c_str(), "rb");
        setvbuf(file, NULL, _IOFBF, 1024);
        while (!feof(file)) {
            fread(buf, 0x1, MAX, file);
        }
        fclose(file);
        clock_t end = clock();
        cout << end-start << endl;
    }
    {
        clock_t start = clock();
        HANDLE file = CreateFile(fpath.c_str(), GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_ALWAYS, NULL, NULL);
        while (true) {
            DWORD used;
            ReadFile(file, buf, MAX, &used, NULL);
            if (used < MAX) break;
        }
        CloseHandle(file);
        clock_t end = clock();
        cout << end-start << endl;
    }
    system("PAUSE");
}

Times are:
  ifstream: 185
  fread:    80
  ReadFile: 78

Well... it looks like using the C-style fread is faster than ifstream::read. Also, the Windows ReadFile gives only a negligible advantage over fread (I looked at the code, and fread is basically a wrapper around ReadFile). Looks like I'll be switching to fread after all.

Man it is confusing to write a benchmark which actually tests this stuff correctly.

CONCLUSION: Using <cstdio> is faster than <fstream>. The reason fstream is slower is that C++ streams have their own internal buffer, which results in extra copying on every read/write, and this copying accounts for the entire extra time taken by fstream. Even more shocking is that the extra time is longer than the time taken to actually read the file.

retep998
  • `xsgetn` is generally where the actual file IO is done. – Nicol Bolas Apr 25 '12 at 22:19
  • cstdio all the way, plus a custom class based on your access patterns (for pre-fetching, in-memory deferred writes, etc.) – std''OrgnlDave Apr 25 '12 at 22:20
  • Did you get a chance to try it with the C File API? The iostreams library is notoriously slow. – Doug T. Apr 25 '12 at 22:20
  • See also: http://stackoverflow.com/questions/9371238/why-is-reading-lines-from-stdin-much-slower-in-c-than-python and http://stackoverflow.com/questions/4340396/does-the-c-standard-mandate-poor-performance-for-iostreams-or-am-i-just-deali – Adam Rosenfield Apr 25 '12 at 22:21
  • Using Boost.Interprocess' memory-mapped files _is not_ platform dependent and _will_ solve your issue. – ildjarn Apr 26 '12 at 01:01
  • Is there any way I can do this without relying on Boost? I know it's awesome and all, but please, I want to do this using just standard C/C++. – retep998 Apr 26 '12 at 01:30
  • Why don't you use the best solution available instead of making artificial limitations? The best solution is memory-mapped files, and unless you want to write the tedious cross-platform implementations yourself, Boost.Interprocess has the best cross-platform memory-mapped file implementation. If standard C++ had this available, _it wouldn't be in Boost_. – ildjarn Apr 26 '12 at 03:26
  • @ildjarn what you mean to say is, if C++ had this available, then it probably would've been poached from Boost – std''OrgnlDave Apr 26 '12 at 03:45
  • @std''OrgnlDave : Exactly. Here's hoping Interprocess makes it into TR2. – ildjarn Apr 26 '12 at 03:47
  • Make sure you are running in release mode. There is no reason why C++'s read should be slower than C. E.g. http://insanecoding.blogspot.com/2011/11/how-to-read-in-file-in-c.html – user904963 Nov 22 '14 at 23:14

3 Answers


Can I somehow tell the ifstream to buffer the size I need directly into my buffer?

Yes, this is what pubsetbuf() is for.

But if you're that concerned with copying while reading a file, consider memory mapping as well; Boost has a portable implementation.

Cubbi
  • I've already looked into using pubsetbuf, but how would I then go about reading a specific-sized chunk into that buffer? – retep998 Apr 25 '12 at 22:21

If you want to speed up file I/O, I suggest using the good ol' <cstdio>, because it can outperform <fstream> by a large margin.

orlp
  • May I ask why? I know it avoids a lot of virtual function calls and class overhead, but what does cstdio do in terms of the internal buffer? – retep998 Apr 25 '12 at 22:22
  • @retep998: Classes don't have overhead. Virtual functions *do*. It's not the buffering that's the problem; `xsgetn` is doing file IO. But it's also doing things like calling `codecvt` for the characters. – Nicol Bolas Apr 25 '12 at 22:25
  • @retep998: that's because you read per character. Read in chunks. – orlp Apr 26 '12 at 00:59
  • Actually it was because I didn't have binary mode enabled for the files, so they were hitting EOF marks and ending early. Now I can see that cstdio is faster than ifstream. – retep998 Apr 26 '12 at 03:24
  • This is just false. An idiomatic solution in C++ that uses `istream::read` with compiler optimizations is just as fast as a C solution. e.g. http://insanecoding.blogspot.com/2011/11/how-to-read-in-file-in-c.html I confirmed these results myself with VS2013. – user904963 Nov 22 '14 at 23:13
  • @user904963 You read the entire file into memory at once; try as a benchmark to read a very large (at least twice as large as your total RAM) file, and XOR all the bytes together to prevent optimization. It is possible to achieve C-style I/O speeds by using the right flags on the C++ streams, and only using the `read` method, but then you're basically doing C-style I/O using C++ streams. Notice that in the linked blog post every single C++-style I/O solution was slower than the C-style one, except for the one where you're basically doing the one-to-one mapping using iostream methods. – orlp Nov 23 '14 at 10:34
  • @user904963 (continuation) And if you want really good speeds, you should even forego C-style I/O and optimize using the OS directly, via memory mappings or other methods. – orlp Nov 23 '14 at 10:35
  • @orlp I don't think you are able to read. Using `istream::read` is just as fast as C code with compiler optimizations on. I wasn't inviting you into a debate since I'm right here. You can just try it out yourself. – user904963 Nov 25 '14 at 08:49

It has been proven several times that the fastest way of reading data on Linux systems is mmap(). I don't know about Windows, but it certainly does away with this buffering.

fopen(), fread(), and fwrite() (FILE*) are somewhat higher-level and may introduce a buffer, while the open(), read(), and write() functions are low-level, and the only buffering there comes from the OS kernel.

Nicol Bolas
ypnos