I have an fstream that seems to get into a phantom failure state: examining it (via conversion to bool) reveals no error flags, yet the next read unexpectedly fails.

#include <fstream>
#include <iostream>
#include <cassert>

int main()
{
    const unsigned int SAMPLE_COUNT = 2;

    // Setup - create file with two "samples"; each sample has a double and a float
    {
        std::ofstream ofs("/tmp/ffs", std::ios::binary);

        const double colA[SAMPLE_COUNT] = {  1.0,    2.0 };
        const float  colB[SAMPLE_COUNT] = { 42.0,  100.0 };

        for (size_t i = 0; i < SAMPLE_COUNT; i++) {
            ofs.write((char*)&colA[i], sizeof(colA[i]));
            ofs.write((char*)&colB[i], sizeof(colB[i]));
        }
    }

    // Actual testcase
    {
        std::fstream fs("/tmp/ffs", std::ios::binary | std::ios::out | std::ios::in);
        assert(fs);

        unsigned int sample_n = 0;
        while (true) {
            double a;
            fs.read((char*)&a, sizeof(a));

            if (!fs) {
                std::cerr << "No more samples\n";
                break;
            }

            std::cerr << "Sample " << (++sample_n) << " first column = " << a << '\n';


            // Read column B
            float b;
            fs.read((char*)&b, sizeof(b));
            assert(fs);
            std::cerr << "Current value second column = " << b << '\n';

            // Multiply it by two and write back
            b *= 2;
            fs.seekp(-static_cast<std::streamoff>(sizeof(b)), std::ios::cur);
            fs.write((char*)&b, sizeof(b));
            assert(fs);

            #ifdef DO_THE_TELLG
                // Unless I do this, the `fs.read` on the next iteration fails!
                // So the loop ends, and I get only the first sample transformed in my file.
                fs.tellg();
            #endif
        }
    }

    // Inspection - should see values 84 and 200, but see 84 and 100 instead.
    {
        std::ifstream ifs("/tmp/ffs", std::ios::binary);

        std::cerr << "All values at end:\n";
        for (size_t i = 0; i < SAMPLE_COUNT; i++) {

            double a;
            float  b;

            ifs.read((char*)&a, sizeof(a));
            ifs.read((char*)&b, sizeof(b));

            std::cerr << a << '\t' << b << '\n';
        }

        assert(ifs);
    }
}

Observe in the following output that only the first sample was "parsed", so at the end the second sample retains its original value of 100:

[root@localhost ~]# ./test
Sample 1 first column = 1
Current value second column = 42
No more samples
All values at end:
1       84
2       100

If I perform a tellg() or tellp() operation on the stream after the write, the subsequent read succeeds, so the loop does not end prematurely and the second sample is also multiplied by 2, producing 200:

[root@localhost ~]# ./test
Sample 1 first column = 1
Current value second column = 42
Sample 2 first column = 2
Current value second column = 100
No more samples
All values at end:
1       84
2       200

This only occurs for me in the following environments:

  • CentOS 6 x86_64, GCC 4.4.7
  • CentOS 6 x86_64, GCC 4.8.2 (via devtoolset-2)

I get the expected behaviour, with or without tellg()/tellp(), on:

(Where a listed compiler supports both C++03 and C++11, I have tried it under both and observed no difference.)

Does my program have UB? Or, is this a libstdc++ bug that I need to work around?


Update: okay, so this is a known thing. But Dietmar doesn't say whether this is standard-mandated, or a libstdc++ bug. To me, it looks like bug 40732, but that was RESOLVED/WONTFIX, so why would my program work as expected under Coliru and CentOS 7? Ideally I'd like to get a better handle on precisely what's going on before putting this workaround in place.
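
For reference, here is a minimal sketch of the kind of workaround I mean. It assumes (this is my current understanding, not something I have confirmed in the standard text) that a file stream needs a positioning call between a write and the following read, in the same way C stdio does. The helper name `double_second_column` is hypothetical and only exists for this illustration; the record layout is the same (double, float) pair as in the testcase. Instead of relying on the side effect of tellg(), it repositions explicitly with an absolute seekg():

```cpp
#include <cstddef>
#include <fstream>
#include <ios>

// Hypothetical helper (not part of the testcase above): doubles the float
// column of every (double, float) record in `path`, in place.
// Returns the number of records that were updated.
std::size_t double_second_column(const char* path)
{
    std::fstream fs(path, std::ios::binary | std::ios::in | std::ios::out);
    std::size_t updated = 0;

    while (fs) {
        double a;
        float  b;

        // Read one record; stop at end of file or on any read error.
        if (!fs.read(reinterpret_cast<char*>(&a), sizeof a)) break;
        if (!fs.read(reinterpret_cast<char*>(&b), sizeof b)) break;

        b *= 2;

        // Read -> write: step back over the float we just read.
        fs.seekp(-static_cast<std::streamoff>(sizeof b), std::ios::cur);
        fs.write(reinterpret_cast<const char*>(&b), sizeof b);
        if (!fs) break;

        // Write -> read: reposition explicitly (assumed to be required)
        // to the current position before the next read.
        fs.seekg(fs.tellp());
        if (!fs) break;

        ++updated;
    }

    return updated;
}
```

Called on the two-sample file from the testcase, I would expect this to return 2 and leave 84 and 200 in the second column. I chose an absolute seek rather than a bare tellg() because it states the intent (switching from writing back to reading) instead of depending on a side effect.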

Lightness Races in Orbit
  • It is standard-mandated. See e.g. http://stackoverflow.com/questions/17536570/reading-and-writing-to-the-same-file-using-the-same-fstream – n. m. could be an AI Feb 08 '17 at 15:25
  • @n.m.: _"So your stream.seekp(1) solution, which devolves to a C fseek, is correct."_ Is it mandated that that's how seekp works? And how come it works on CentOS 7 then? The accepted answer there claims that the GNU C library doesn't have this restriction but that's not my experience here on CentOS 6. So now I'm even more confused! – Lightness Races in Orbit Feb 08 '17 at 15:37
  • The claim about glibc may or may not be correct, or may or may not refer to only some glibc versions. I have never seen it confirmed in the documentation. – n. m. could be an AI Feb 08 '17 at 16:56
  • This workaround has been in place for months and apparently works great. But I'd still like to know the official word for peace of mind. – Lightness Races in Orbit Jan 22 '18 at 01:14
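
To make the C stdio rule discussed in these comments concrete, here is a rough sketch of the same in-place update written against `FILE*`. It follows the C rule that, on a stream opened for update, output must not be directly followed by input without an intervening fflush or positioning call, and input must not be directly followed by output without a positioning call (unless the read hit end-of-file). The helper name `double_second_column_stdio` is hypothetical, and whether a given C library actually misbehaves without the extra fseek is not something this sketch demonstrates:

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: doubles the float column of every (double, float)
// record in `path`, in place, using C stdio. Returns the number of records
// that were updated.
std::size_t double_second_column_stdio(const char* path)
{
    std::FILE* f = std::fopen(path, "r+b");
    if (!f) return 0;

    std::size_t updated = 0;
    double a;
    float  b;

    while (std::fread(&a, sizeof a, 1, f) == 1 &&
           std::fread(&b, sizeof b, 1, f) == 1) {
        b *= 2;

        // Input followed by output: a positioning call must come in between.
        if (std::fseek(f, -static_cast<long>(sizeof b), SEEK_CUR) != 0) break;
        if (std::fwrite(&b, sizeof b, 1, f) != 1) break;

        // Output followed by input: fflush or a positioning call must come
        // in between. A zero-offset fseek keeps the position unchanged.
        if (std::fseek(f, 0, SEEK_CUR) != 0) break;

        ++updated;
    }

    std::fclose(f);
    return updated;
}
```

The zero-offset `fseek(f, 0, SEEK_CUR)` plays the same role here that the tellg() call plays in the fstream version of the loop.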

0 Answers