1

C# coder just wrote this simple C++ method to get text from a file:

static std::vector<std::string> readTextFile(const std::string &filePath) {
    std::string line;
    std::vector<std::string> lines;
    std::ifstream theFile(filePath.c_str());
    while (theFile.good()) {
    getline (theFile, line);
        lines.push_back(line);
    }
    theFile.close();
    return lines;
}   

I know this code is not efficient; the text lines are copied once as they are read and a second time when returned-by-value.

Two questions:

(1) can this code leak memory ? (2) more generally can returning containers of objects by value ever leak memory ? (assuming the objects themselves don't leak)

tpascale
  • 2,516
  • 5
  • 25
  • 38
  • You probably don't want `static` everywhere like you do in C#. Also, containers take care of freeing memory when they are destroyed, which, when allocated on the stack like you have, is when they go out of scope. – chris Aug 30 '12 at 19:56
  • Actually return-by-value doesn't necessarily leads to a copy. It can possibly be avoided due to [RVO](http://en.wikipedia.org/wiki/NRVO) or, if c++11 is used and RVO is impossible the return value is moved instead of beeing copied (moving a `vector` costs about three pointer assignments, so that doesn't really cost much). And no, returning objects by value doesn't leak memory if the object itself is correct. In c++ you are generally fine with regards to leaking memory unless you start using `new` – Grizzly Aug 30 '12 at 20:01
  • 1
    Better: `while (std::getline(theFile, line))`, otherwise you get an extra empty line at eof. Apart from that, the code is fine. In C++11 you have idiomatic ways to avoid the extra copies. In C++03, you often prefer the extra copies to convolved code. – Alexandre C. Aug 30 '12 at 20:09

5 Answers5

5
while (theFile.good()) {
 getline (theFile, line);
    lines.push_back(line);
}

Forget about efficiency, this code is not correct. It will not read the file correctly. See the following topic to know why:

So the loop should be written as:

while (getline (theFile, line)) {
    lines.push_back(line);
}

Now this is correct. If you want to make it efficient, profile your application first. Try see the part which takes most CPU cycles.


(1) can this code leak memory ?

No.

(2) more generally can returning containers of objects by value ever leak memory ?

Depends on the type of the object in the containers. In your case, the type of the object in std::vector is std::string which makes sure that no memory will be leaked.

Community
  • 1
  • 1
Nawaz
  • 353,942
  • 115
  • 666
  • 851
1

No and no. Returning by value never leaks memory (assuming the containers and the contained objects are well written). It would be fairly useless if it was any other way.

And I second what Nawaz says, your while loop is wrong. It's frankly incredible how many times we see that, there must be an awful lot of bad advice out there.

john
  • 85,011
  • 4
  • 57
  • 81
  • thanks - fyi I got the loop construct at "www.cplusplus.com". I guess even with a name like that one has to be careful about clipping snippets. – tpascale Aug 30 '12 at 20:53
  • @tpascale Would be interested to know the exact address. – john Aug 30 '12 at 21:07
  • http://www.cplusplus.com/doc/tutorial/files/ and see the 2nd example in the section "Text Files". – tpascale Aug 30 '12 at 22:33
  • Interesting, the text is actually OK, it implies that good() is false *after* the last I/O operation fails. But if the author knew that why did they write the loop that way? – john Aug 31 '12 at 07:43
1

(1) can this code leak memory ?

No

(2) more generally can returning containers of objects by value ever leak memory ?

No. You might leak memory that is stored in a container by pointer or through objects that leak. But that would not be caused by returning by value.

I know this code is not efficient; the text lines are copied once as they are read and a second time when returned-by-value.

Most probably not. There are two copies of the string, but not the ones that you are thinking about. The return copy will most likely be optimized in C++03, and will either be optimized away or transformed into a move (cheap) in C++11.

The two copes are rather:

getline (theFile, line);
lines.push_back(line);

The first line copies from the file to line, and the second copies from line to the container. If you are using a C++11 compiler you can change the second line to:

lines.push_back(std::move(line));

to move the contents of the string to the container. Alternatively (and also valid in C++03), you can change the two lines with:

lines.push_back(std::string()); // In most implementations this is *cheap*
                                // (i.e. no memory allocation)
getline(theFile, lines.back());

And you should test the result of the read (if the read fails, in the last alternative, make sure to resize to one less element to remove the last empty string.

David Rodríguez - dribeas
  • 204,818
  • 23
  • 294
  • 489
  • To remove the last element, you can use `pop_back`. That's way simpler than getting the length, subtracting 1 and `resize`ing. – Jens Kilian Aug 31 '12 at 06:51
1

In C++11, you can do:

std::vector<std::string> 
read_text_file(const std::string& path) 
{
    std::string line;
    std::vector<std::string> ans;
    std::ifstream file(path.c_str());

    while (std::getline(file, line))
       ans.push_back(std::move(line));

    return ans;
}

and no extra copies are made.

In C++03, you accept the extra copies, and painfully remove them only if profiling dictates so.

Note: you don't need to close the file manually, the destructor of std::ifstream does it for you.

Note2: You can template on the char type, this may be useful in some circumstances:

template <typename C, typename T>
std::vector<std::basic_string<C, T>>
read_text_file(const char* path)
{
    std::basic_string<C, T> line;
    std::vector<std::basic_string<C, T>> ans;
    std::basic_ifstream<C, T> file(path);

    // Rest as above
}
Alexandre C.
  • 55,948
  • 11
  • 128
  • 197
0

No, returning container by value should not leak memory. The standard library is designed not to leak memory itself in any case. It can only leak a memory if there is a bug in its implementation. At least there used to be a bug in vector of strings in an old MSVC.

Juraj Blaho
  • 13,301
  • 7
  • 50
  • 96