
I need to open an existing file and write to an arbitrary position in it, including a position that may lie beyond the current end of the file.

Opening the file with "ab" will set the position indicator to the end of the file every time a write operation is invoked - so that won't work.

Opening the file with "w+b" or "wb" results in the file being written several times (copied?). Filesize starts over at 0 several times - and it takes a long time. See video of what happens when the test below is executed (1 run): http://screencast.com/t/Uj5ymikZUYJ

BOOST_AUTO_TEST_CASE(FileWriteTest_W_PLUS_B) {
    auto started = chrono::high_resolution_clock::now();

    FILE *filePointer = nullptr;
    auto tmpFilename = string("C:\\temp\\") + boost::uuids::to_string(boost::uuids::random_generator()());
    auto bufferSize = 1024 * 1024;

    unique_ptr<unsigned char[]> buffer(new unsigned char[bufferSize]);
    RAND_bytes(buffer.get(), bufferSize);

    for (long long i = 0; i < 5; i++) {
        //Open file
        int openError = fopen_s(&filePointer, tmpFilename.c_str(), "w+b");  
        if (openError != 0)
            BOOST_FAIL(string("Failed to open file ") + tmpFilename);

        auto CurrentPosition = 1024LL * 1024LL * 1024LL * i;

        //Set position to 0/1/2/3/4 GB 
        fsetpos(filePointer, &CurrentPosition);

        //Write 1 GB of data at current position
        for (int n = 0; n < 1024; n++) {
            int written = fwrite(buffer.get(), sizeof(unsigned char), bufferSize, filePointer);
            if (written != bufferSize) {
                BOOST_FAIL(string("Unable to write ") + to_string(bufferSize) + string(" to file ") + tmpFilename + string(" at position ") + to_string(CurrentPosition));
            }
        }

        //Close file
        fclose(filePointer);
    }

    auto ended = chrono::high_resolution_clock::now();
    cout << "Time :" << duration_cast<duration<double>>(ended - started).count() << " seconds";
}

So my question is: Is there any way to open an existing file and write to an arbitrary position (including a position beyond the current size) without the performance penalty that I'm currently seeing with "wb" / "w+b"?

Or do I have to grow the file to its final size the first time I write to it (as torrent clients, for example, seem to do)?

fstreams are not an option due to their poor I/O performance. (See writing-a-binary-file-in-c-very-fast)

  • 4
    _"fstreams are not an option due to their poor I/O performance"_ [That's a myth](http://stackoverflow.com/q/17468088/560648), propagated by poor coding. – Lightness Races in Orbit Aug 22 '16 at 13:32
  • 2
    @LightnessRacesinOrbit: Please see http://stackoverflow.com/questions/11563963/writing-a-binary-file-in-c-very-fast – Njål Arne Gjermundshaug Aug 22 '16 at 13:39
  • 1
    Now click on the link in my previous comment, and also read http://stackoverflow.com/q/12997131/560648. I do not dispute that many people _claim_ fstreams are inherently much slower. I only claim that in almost all cases this observation is down to poor code and/or broken benchmarks. – Lightness Races in Orbit Aug 22 '16 at 14:22
  • @LightnessRacesinOrbit Perhaps you could provide me with some blazing fast fstream code that outperforms FILE * ? Would love to benchmark this. – Njål Arne Gjermundshaug Aug 22 '16 at 14:29
  • Sure! If the information provided in the linked Q&As isn't good enough. What's your budget? – Lightness Races in Orbit Aug 22 '16 at 14:51
  • 1
    Just tested on MSVC2015U3, Windows 10 x64, fstream still seems to be CPU-bound. But!! On FreeBSD and Ubuntu fstream in fact was faster, and used less CPU than fopen. [my test code](http://rextester.com/TXIR55266) – rustyx Aug 22 '16 at 15:48

2 Answers

4

You should open the stream in "r+b" mode. "w" mode causes the file to be truncated. If the file does not exist, it must first be created with "wb".

Note however that fsetpos() may fail to set the current position beyond the end of the file. You should check the return value and pad the file to the target position if needed.

For positions to be meaningful as byte offsets, the stream must also be opened in binary mode with b. Streams are opened in text mode by default, which on some systems such as Windows may prevent correct offset management.
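As a rough illustration of the points above, here is a minimal sketch that opens with "r+b", falls back to creating the file only when that fails, and checks the result of the seek and the write. The helper names `seek_to` and `write_at` are made up for this example. `_fseeki64` is MSVC-specific and `fseeko` is POSIX (a 64-bit `off_t` is assumed); on MSVC, `fopen_s` as used in the question could replace `std::fopen`. Seeking past the end and then writing extends the file on common Windows and POSIX systems, but the C standard does not guarantee it, so treat that as an assumption.

```cpp
#include <cstdio>
#include <cstddef>
#ifndef _WIN32
#include <sys/types.h>  // off_t
#endif

// Hypothetical helper: 64-bit seek measured from the start of the file.
static int seek_to(std::FILE* f, long long offset) {
#ifdef _WIN32
    return _fseeki64(f, offset, SEEK_SET);                   // MSVC-specific
#else
    return fseeko(f, static_cast<off_t>(offset), SEEK_SET);  // POSIX
#endif
}

// Hypothetical helper: write `size` bytes at `offset` without truncating
// the existing contents. Returns true on success.
bool write_at(const char* path, long long offset,
              const void* data, std::size_t size) {
    // "r+b" opens an existing file for update and keeps its contents.
    std::FILE* f = std::fopen(path, "r+b");
    if (!f) {
        // Assumption: the open failed because the file does not exist yet,
        // so create it once with "w+b".
        f = std::fopen(path, "w+b");
        if (!f) return false;
    }
    bool ok = seek_to(f, offset) == 0 &&
              std::fwrite(data, 1, size, f) == size;
    // fclose flushes buffered data, so its result matters as well.
    if (std::fclose(f) != 0) ok = false;
    return ok;
}
```

Calling `write_at(path, 1024LL * 1024LL * 1024LL * i, buffer.get(), bufferSize)` in the question's loop would keep earlier writes intact, because the file is never reopened in a truncating mode once it exists.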

chqrlie
0

I suggest you manually keep track of the file size and if the size is too small, use something like this:

fseek(f, 0, SEEK_END);
long pos = ftell(f);

while (pos < wantedsize) {
  fputc(0, f); ++pos;
}

Beware that this has problems with files exceeding ~2 GB, because ftell() returns a long, which is 32 bits wide on Windows.
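One possible way around the ~2 GB limit is to use the 64-bit seek/tell variants and to pad in larger chunks instead of one byte at a time. The following is only a sketch: `pad_to_size`, `seek_end64`, and `tell64` are hypothetical names, `_fseeki64`/`_ftelli64` are MSVC-specific, `fseeko`/`ftello` are POSIX (a 64-bit `off_t` is assumed), and the 64 KiB chunk size is an arbitrary choice. The stream is assumed to be open for update in binary mode ("r+b" or "w+b").

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>
#ifndef _WIN32
#include <sys/types.h>  // off_t
#endif

// Hypothetical wrappers around the platform's 64-bit seek/tell functions.
static int seek_end64(std::FILE* f) {
#ifdef _WIN32
    return _fseeki64(f, 0, SEEK_END);
#else
    return fseeko(f, 0, SEEK_END);
#endif
}

static long long tell64(std::FILE* f) {
#ifdef _WIN32
    return _ftelli64(f);
#else
    return static_cast<long long>(ftello(f));
#endif
}

// Hypothetical helper: append zero bytes until the file is at least
// `wanted_size` bytes long. Never shrinks the file. Returns true on success.
bool pad_to_size(std::FILE* f, long long wanted_size) {
    if (seek_end64(f) != 0) return false;
    long long pos = tell64(f);
    if (pos < 0) return false;

    const std::vector<unsigned char> zeros(64 * 1024, 0);
    while (pos < wanted_size) {
        // Write whole chunks, and a shorter one for the final remainder.
        const long long remaining = wanted_size - pos;
        const std::size_t chunk = static_cast<std::size_t>(
            std::min<long long>(remaining,
                                static_cast<long long>(zeros.size())));
        if (std::fwrite(zeros.data(), 1, chunk, f) != chunk) return false;
        pos += static_cast<long long>(chunk);
    }
    return true;
}
```

After `pad_to_size(f, target)` succeeds, the stream is positioned at the end of the file, so a write at `target` can follow directly when the file was shorter than `target`; otherwise a seek to `target` is still needed.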

Sven Nilsson