I've read that <fstream> predates <exception>. Ignoring the fact that exceptions on fstream aren't very informative, I have the following question:

It's possible to enable exceptions on file streams using the exceptions() method.

ifstream stream;
stream.exceptions(ifstream::failbit | ifstream::badbit);
stream.open(filename.c_str(), ios::binary);

Any attempt to open a nonexistent file, a file without the correct permissions, or any other I/O problem will result in an exception. This fits an assertive programming style very well: the file was supposed to be there and be readable, and if those conditions aren't met, we get an exception. If I weren't sure whether the file could safely be opened, I could use other functions to test for it.

But now suppose I try to read into a buffer, like this:

char buffer[10];
stream.read(buffer, sizeof(buffer)); 

If the stream detects the end-of-file before filling the buffer, it sets failbit, and an exception is thrown if exceptions were enabled. Why? What's the point of this? I could have checked for that just by testing eof() after the read:

char buffer[10];
stream.read(buffer, sizeof(buffer));
if (stream.eof()) { // or stream.gcount() != sizeof(buffer)
    // handle eof myself
}

This design choice prevents me from using standard exceptions on streams and forces me to write my own exception handling for permission or I/O errors. Or am I missing something? Is there any way out? For example, can I easily test whether I can read sizeof(buffer) bytes from the stream before doing so?

ceztko

3 Answers

The failbit is designed to allow the stream to report that some operation failed to complete successfully. This includes errors such as failing to open the file, trying to read data that doesn't exist, and trying to read data of the wrong type.

The particular case you're asking about is reprinted here:

char buffer[10];
stream.read(buffer, sizeof(buffer)); 

Your question is why failbit is set when the end-of-file is reached before all of the input is read. The reason is that the read operation failed: you asked to read 10 characters, but there weren't enough characters in the file. Consequently, the operation did not complete successfully, and the stream sets failbit to let you know this, even though the characters that were available have been read.
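
As a rough illustration of the behavior described above, the sketch below performs a short read and records the resulting state. It uses a std::istringstream as a stand-in for the file stream, and the names ReadResult, try_read and short_read_demo are hypothetical helpers introduced only for this example. Exceptions are left disabled so the flags can be inspected.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper type: what a read() call left behind.
struct ReadResult {
    std::streamsize count; // characters actually extracted (gcount)
    bool eof;              // eofbit set after the read
    bool fail;             // failbit or badbit set after the read
};

// Attempts to read exactly n characters and reports the outcome.
ReadResult try_read(std::istream& in, char* buffer, std::streamsize n)
{
    in.read(buffer, n);
    return ReadResult{in.gcount(), in.eof(), in.fail()};
}

// Assumption: a string stream stands in for the binary file.
ReadResult short_read_demo()
{
    std::istringstream in("abcd"); // only 4 characters are available
    char buffer[10];
    return try_read(in, buffer, sizeof(buffer));
}
```

Here short_read_demo() asks for 10 characters but only 4 exist, so gcount() reports 4 while both eofbit and failbit are set. With failbit enabled in exceptions(), the same read() call would throw instead.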

If you want to do a read operation where you want to read up to some number of characters, you can use the readsome member function:

char buffer[10];
streamsize numRead = stream.readsome(buffer, sizeof(buffer)); 

This function will read characters up to the end of the file, but unlike read it doesn't set failbit if the end of the file is reached before the characters are read. In other words, it says "try to read this many characters, but it's not an error if you can't. Just let me know how much you read." This contrasts with read, which says "I want precisely this many characters, and it's an error if you can't do it."

EDIT: An important detail I forgot to mention is that eofbit can be set without triggering failbit. For example, suppose that I have a text file that contains the text

137

without any newlines or trailing whitespace afterwards. If I write this code:

ifstream input("myfile.txt");

int value;
input >> value;

Then at this point input.eof() will return true, because when reading the characters from the file the stream hit the end of the file trying to see if there were any other characters in the stream. However, input.fail() will not return true, because the operation succeeded - we can indeed read an integer from the file.
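
The same situation can be reproduced without a file. The sketch below assumes a std::istringstream in place of myfile.txt, and ExtractResult and extract_int are hypothetical names used only for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper type: the extracted value plus the stream flags.
struct ExtractResult {
    int value;
    bool eof;  // eofbit set after the extraction
    bool fail; // failbit or badbit set after the extraction
};

// Extracts one int from the given text and reports the stream state.
ExtractResult extract_int(const std::string& text)
{
    std::istringstream input(text);
    int value = 0;
    input >> value;
    return ExtractResult{value, input.eof(), input.fail()};
}
```

Calling extract_int("137") should yield the value 137 with eof set and fail clear, because the extraction had to look past the last digit to know the number was complete. With a trailing newline, as in extract_int("137\n"), neither flag is set.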

Hope this helps!

templatetypedef
  • This is, unfortunately, not helping me use exceptions in a useful way. I think most frameworks nowadays don't throw exceptions on reaching EOF in blocking reads ([example](http://msdn.microsoft.com/en-us/library/system.io.filestream.read.aspx)), which is what I was trying to achieve. I forgot to mention that I'm using the stream in binary mode, if it makes any difference, and what I was trying to read was a binary header, so using the >> operator is not an option. – ceztko Jul 21 '11 at 20:02
  • @ceztko- Can you elaborate on why this doesn't help you? What specific problems are you encountering? My discussion of `read` vs `readsome` was intended to provide a mechanism for avoiding spurious `failbit` exceptions; is this not what you were asking? – templatetypedef Jul 21 '11 at 20:04
  • `readsome()` is said to be non-blocking. While I agree that I wouldn't have any problems when reading from files, relying on that would be an assumption that good programming practice says I shouldn't make. – ceztko Jul 21 '11 at 20:13
  • @ceztko- I'm not sure what you mean; the C++ standard has no notion of "blocking" versus "nonblocking" I/O. Where did you read that? – templatetypedef Jul 21 '11 at 20:20
  • So what you want to say is `stream.readsome(buffer, sizeof(buffer))` will **never** return if (*the argument buffer is not filled*) **and** ((*the stream is not eof*) **or** (*the stream is not on fail*))? – ceztko Jul 21 '11 at 20:42
  • I believe that's correct. However, `read` is the same way; it doesn't return until it's either read everything or detected an error. If this is what you mean by "blocking," then all of the C++ I/O routines are blocking. – templatetypedef Jul 21 '11 at 20:45
  • [This](http://stackoverflow.com/questions/1137351/ifstream-read-vs-ifstream-readsome-in-msvc7-1) discussion seems to indicate that `readsome` is really a non-blocking read. – ceztko Jul 21 '11 at 20:51
  • Ah, I see. My apologies - I misread the spec's description of `readsome`. You are correct - `readsome` will just read what's already buffered without necessarily having the stream buffer extract more data from the device. – templatetypedef Jul 21 '11 at 20:59

Using the underlying buffer directly seems to do the trick:

char buffer[10];
streamsize num_read = stream.rdbuf()->sgetn(buffer, sizeof(buffer));
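
To make the effect on the stream state visible, here is a minimal sketch that reads through the stream buffer and then peeks, as suggested in the comment on this answer. It assumes a std::istringstream in place of the file stream, and RawReadResult and sgetn_demo are hypothetical names used only for this example.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper type: stream flags before and after peek().
struct RawReadResult {
    std::streamsize count; // characters returned by sgetn()
    bool eof_before_peek;
    bool fail_before_peek;
    bool eof_after_peek;
    bool fail_after_peek;
};

RawReadResult sgetn_demo()
{
    std::istringstream stream("abcd"); // only 4 characters are available
    char buffer[10];

    // Reads straight from the stream buffer; the stream's own state
    // flags are not touched by this call.
    std::streamsize n = stream.rdbuf()->sgetn(buffer, sizeof(buffer));
    bool eof_before = stream.eof();
    bool fail_before = stream.fail();

    // peek() goes through the stream, so it notices the end of input
    // and sets eofbit.
    stream.peek();

    return RawReadResult{n, eof_before, fail_before,
                         stream.eof(), stream.fail()};
}
```

In this sketch sgetn() returns 4 and leaves both flags clear; only the following peek() sets eofbit, and failbit stays clear throughout.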
absence
  • +1: this is a great discovery! It has at least one pitfall: if the containing stream is EOF, eofbit won't be set by the action of sgetn. It can be taken into account by doing a `peek` on the stream immediately after the `sgetn`. Also on Windows it never fails, even in case of real forced failure, but this seems to be an implementation problem. I added another answer with more research but your approach is the correct one. – ceztko Mar 23 '14 at 16:52

Improving on @absence's answer, here is a function readeof() that does the same as read() but doesn't set failbit on EOF. Real read failures have also been tested, such as a transfer interrupted by physically removing a USB stick or by a link drop while accessing a network share. It has been tested on Windows 7 with VS2010 and VS2013 and on Linux with gcc 4.8.1. On Linux only USB stick removal has been tried.

#include <iostream>
#include <fstream>
#include <stdexcept>

using namespace std;

streamsize readeof(istream &stream, char *buffer, streamsize count)
{
    if (count == 0 || stream.eof())
        return 0;

    streamsize offset = 0;
    streamsize reads;
    do
    {
        // This consistently fails on gcc (linux) 4.8.1 with failbit set on read
        // failure. This apparently never fails on VS2010 and VS2013 (Windows 7)
        reads = stream.rdbuf()->sgetn(buffer + offset, count);

        // This rarely sets failbit on VS2010 and VS2013 (Windows 7) on read
        // failure of the previous sgetn()
        (void)stream.rdstate();

        // On gcc (linux) 4.8.1 and VS2010/VS2013 (Windows 7) this consistently
        // sets eofbit when the stream reached EOF as a consequence of sgetn().
        // It should also throw if exceptions are enabled (or otherwise just
        // return) when the previous rdstate() restored a failbit on Windows.
        // On Windows it usually sets eofbit even on a real read failure
        (void)stream.peek();

        if (stream.fail())
            throw runtime_error("Stream I/O error while reading");

        offset += reads;
        count -= reads;
    } while (count != 0 && !stream.eof());

    return offset;
}

#define BIGGER_BUFFER_SIZE 200000000

int main(int argc, char* argv[])
{
    ifstream stream;
    stream.exceptions(ifstream::badbit | ifstream::failbit);
    stream.open("<big file on usb stick>", ios::binary);

    char *buffer = new char[BIGGER_BUFFER_SIZE];

    streamsize reads = readeof(stream, buffer, BIGGER_BUFFER_SIZE);

    if (stream.eof())
        cout << "eof" << endl << flush;

    delete[] buffer; // allocated with new[], so it must be released with delete[]

    return 0;
}

Bottom line: on Linux the behavior is more consistent and meaningful. With exceptions enabled, real read failures will throw in sgetn(). By contrast, Windows treats read failures as EOF most of the time.

ceztko