5

These are the things I cannot change:

  • The language is C++
  • However, the file is opened with the good old fopen()
  • The file is not opened in binary mode

This is what I have to do:

  • Write a function that loads the entire file into a std::string. Lines should be separated only by \n and not other variants.

This is what I did:

string ReadWhole()
{
    Seek(0);
    char *data = new char[GetSize()];

    if (1 != fread(data, GetSize(), 1, mFile))
        FATAL(Text::Format("Read error: {0}", strerror(errno)));

    string ret(data, GetSize());
    delete[] data;
    return ret;
}

For reference, this is GetSize, but it simply returns the size of the file (cached):

int GetSize()
{
    if (mFileSize)
        return mFileSize;

    const int current_position = ftell(mFile);
    fseek(mFile, 0, SEEK_END);
    mFileSize = ftell(mFile);
    fseek(mFile, current_position, SEEK_SET);

    return mFileSize;
}

This is the problem

fread() fails because the file has \r\n line endings and they count as only 1 character instead of 2, so it tries to read more than the characters in the file.

I could fix it with fgets but I was wondering if there was a better way. Thanks.

pmg
  • 106,608
  • 13
  • 126
  • 198
hbyy
  • 53
  • 3
  • Great first question. Welcome to SO! – F. P. Dec 12 '10 at 16:54
  • no `std::string` in C. Tag removed. – pmg Dec 12 '10 at 17:29
  • 1
    It appears you're initializing mFileSize to zero and using that to indicate the value has not yet been cached. However, since zero is a valid file size, you should use a different sentinel value, such as -1. – Fred Nurk Dec 12 '10 at 19:50
  • Note that there's a potential memory leak in your code. If `string ret(...);` throws `std::bad_alloc`, you'll leak the buffer in `data`. – André Caron Dec 12 '10 at 20:39

4 Answers4

3

After fread returns that it was unable to read the requested number of bytes, you should simply check ferror(mFile). If it's 0 (false) then fread just stopped at the end of the file and you should not tread this as an error. You should switch the two arguments though so you can get the number of bytes actually read:

size_t number_of_bytes_read = fread(data, 1, GetSize(), mFile);
R.. GitHub STOP HELPING ICE
  • 208,859
  • 35
  • 376
  • 711
2

There is a trivial, idiomatic way to perform this operation.

#include <string>
#include <fstream>
#include <sstream>
std::string load_file ( const std::string& path ) 
{
    std::ostringstream contents;
    std::ifstream file(path);
    if ( !file.is_open() ) {
        // process error.
    }
    contents << file.rdbuf();
    return (contents.str());
}

Note: this function does not use seeking to obtain the size (in bytes) of the input file. This has the down-side of (few) re-allocations to increase the buffer as more input is made available. It has the up-side of working with other std::istream implementations, which might not be able to provide the contents' size ahead of time (i.e. reading from a socket).

Edit: because your requirements state use of FILE*, which is already opened and you cannot change, you can implement a std::streambuf implementation that uses an existing FILE* to allow re-use of high-level std::istream and std::ostream operations.

A sample implementation is available right here, on StackOverflow.

P.S.: If you've never used non-standard-library stream buffer implementations, here's a quick overview of how to write the function given the implementation I pointed to.

#include <string>
#include <istream>
#include <sstream>
#include "FILEbuf.h"
std::string load_file ( ::FILE * opened_c_file ) 
{
    FILEbuf buffer(opened_c_file);
    std::istream file(&buffer);
    std::ostringstream contents;
    contents << file.rdbuf();
    return (contents.str());
}
Community
  • 1
  • 1
André Caron
  • 44,541
  • 12
  • 67
  • 125
  • Wait a second, updating my answer to use `FILE*`. – André Caron Dec 12 '10 at 16:43
  • Why !file.is_open() instead of !file? Current-standard ifstreams don't have a std::string ctor. – Fred Nurk Dec 12 '10 at 19:52
  • No need to construct an istream on &buffer just to extract it again: contents << &buffer. – Fred Nurk Dec 12 '10 at 19:53
  • @Fred. 1) `operator!` checks other stuff, like the status of the last output operation. I find `!is_open()` to be clearer after using the constructor. 2) I forgot to call `.c_str()`. 3) In this case, the buffer is not required, but someone might want to use the `istream` object directly for other purposes. – André Caron Dec 12 '10 at 20:32
0

You could allocate a fixed-size buffer, and repeatedly fread at most that much from the file and append that to the string with string::apeend(char*, size_type).

jpalecek
  • 47,058
  • 7
  • 102
  • 144
0

Just use fgetc() to read one character at a time. You can use a special case to convert '\r\n' endings to plain '\n's.

std::string ReadWhole() {
    std::string ret;
    char prev = 0, c;
    while ((c = fgetc(mFile)) != EOF) {
        if (prev == '\r' && c == '\n') {
            ret.erase(ret.rend()); // erase the previous \r
        }
        ret += c;
        prev = c;
    }
    return ret;
}
SoapBox
  • 20,457
  • 3
  • 51
  • 87
  • The file is opened in text mode ("The file is not opened in binary mode" from the question), so newline conversions are already done by stdio. – Fred Nurk Dec 12 '10 at 19:05