20

I've seen at least two ways of reading lines from a file in C++ tutorials:

std::ifstream fs("myfile.txt");
if (fs.is_open()) {
  while (fs.good()) {
    std::string line;
    std::getline(fs, line);
    // ...

and:

std::ifstream fs("myfile.txt");
std::string line;
while (std::getline(fs, line)) {
  // ...

Of course, I can add a few checks to make sure that the file exists and is opened. Other than the exception handling, is there a reason to prefer the more-verbose first pattern? What's your standard practice?

Andrew Barber
  • 39,603
  • 20
  • 94
  • 123
jda
  • 265
  • 2
  • 6
  • It's not *quite* an exact duplicate, but the answers to a [previous question](http://stackoverflow.com/questions/1567082/how-do-i-iterate-over-cin-line-by-line-in-c/1567703) are relevant. – Jerry Coffin Aug 28 '11 at 16:51

5 Answers5

27
while (std::getline(fs, line))
{}

This is not only correct but preferable also because it is idiomatic.

I assume in the first case, you're not checking fs after std::getline() as if(!fs) break; or something equivalent. Because if you don't do so, then the first case is completely wrong. Or if you do that, then second one is still preferable as its more concise and clear in logic.

The function good() should be used after you made an attempt to read from the stream; its used to check if the attempt was successful. In your first case, you don't do so. After std::getline(), you assume that the read was successful, without even checking what fs.good() returns. Also, you seem to assume that if fs.good() returns true, std::getline would successfully read a line from the stream. You're going exactly in the opposite direction: the fact is that, if std::getline successfully reads a line from the stream, then fs.good() would return true.

The documentation at cplusplus says about good() that,

The function returns true if none of the stream's error flags (eofbit, failbit and badbit) are set.

That is, when you attempt to read data from an input stream, and if the attempt was failure, only then a failure flag is set and good() returns false as an indication of the failure.

If you want to limit the scope of line variable to inside the loop only, then you can write a for loop as:

for(std::string line; std::getline(fs, line); )
{
   //use 'line'
}

Note: this solution came to my mind after reading @john's solution, but I think its better than his version.


Read a detail explanation here why the second one is preferable and idiomatic:

Or read this nicely written blog by @Jerry Coffin:

Community
  • 1
  • 1
Nawaz
  • 353,942
  • 115
  • 666
  • 851
  • yikes, the top google hit for me was http://www.cplusplus.com/doc/tutorial/files/ How in the world did the method name good() make it into the standard? :) – jda Aug 28 '11 at 04:45
  • @jda: The function `good()` should be used *after* you made an attempt to read from the stream; its used to check if the attempt was successful. In your first case, you don't do so. After `std::getline()`, you assume that the read was successful, without even checking what `fs.good()` returns. Also, you seem to assume that if `fs.good()` returns true, `std::getline` would successfully read a line from the stream. You're going exactly in the opposite direction: the fact is that, if `std::getline` successfully reads a line from the stream, then `fs.good()` would return `true`. – Nawaz Aug 28 '11 at 05:28
  • @jda: I added some more explanation to my answer. Please read it. – Nawaz Aug 28 '11 at 05:35
  • Thanks for the clarification. I was pretty much copying the cplusplus.com example, which seems to do this incorrectly, as you point out. – jda Aug 28 '11 at 05:59
  • 1
    @Nawaz: An alternative that limits scope and that still retains use of the standard idiom is to put the declaration and the loop into a block: `{ std::line line; while (getline(fs, line)) {...} }`. Not that I would use this; I just use the standard idiom. BTW, very nice post. – David Hammen Aug 28 '11 at 07:00
  • 1
    The function `good()` should _never_ be used. If the preceding input succeeded, it may or may not return `true`; you don't know and cannot count on it. If the preceding input failed, it will return `false`, so there's no point in calling it. – James Kanze Jun 24 '13 at 08:57
  • What if the last line of the file is not NL-terminated? – lav Mar 23 '20 at 06:50
  • @lav: Not an issue. Please try it. – Nawaz Mar 23 '20 at 06:52
  • @Nawaz: Correct. std::getline only sets badbit if it could read no characters at all. It sets eofbit if the line is not NL-terminated, but it that case istream's operator bool still returns true, so the line is processed correctly. – lav Mar 23 '20 at 07:22
4

Think of this as an extended comment to Nawaz' already excellent answer.

Regarding your first option,

while (fs.good()) {
  std::string line;
  std::getline(fs, line);
  ...

This has multiple problems. Problem number 1 as that that the while condition is in the wrong place and is superfluous. It's in the wrong place because fs.good() indicates whether or not the most recent action performed on the file was OK. A while condition should be with respect to the upcoming actions, not the previous ones. There is no way to know whether the upcoming action on the file will be OK. What upcoming action? fs.good() does not read your code to see what that upcoming action is.

Problem number two is that the you are ignoring the return status from std::getline(). That's OK if you immediately check the status with fs.good(). So, fixing this up a bit,

while (true) {
  std::string line;
  if (std::getline(fs, line)) {
    ...
  }
  else {
     break;
  }
}

Alternatively, you can do if (! std::getline(fs, line)) { break; } but now you have a break in the middle of the loop. Yech. It is much, much better to make the exit conditions a part of the loop statement itself if at all possible.

Compare that to

std::string line;
while (std::getline(fs, line)) {
  ...
}

This is the standard idiom for reading lines from a file. A very similar idiom exists in C. This idiom is very old, very widely used, and very widely viewed as the correct way to read lines from a file.

What if you come from a shop that bans conditionals with side-effects? (There are lots and lots of programming standards that do just that.) There is a way around this without resorting to the break in the middle of the loop approach:

std::string line;
for (std::getline(fs, line); fs.good(); std::getline(fs, line)) {
  ...
}

Not as ugly as the break approach, but most will agree that this isn't nearly as nice-looking as is the standard idiom.

My recommendation is to use the standard idiom unless some standards idiot has banned its use.

Addendum
Regarding for (std::getline(fs, line); fs.good(); std::getline(fs, line)): This is ugly for two reasons. One is that obvious chunk of replicated code.

Less obvious is that calling getline and then good breaks atomicity. What if some other thread is also reading from the file? This isn't quite so important right now because C++ I/O currently is not threadsafe. It will be in the upcoming C++11. Breaking atomicity just to keep the enforcers of the standards happy is recipe for disaster.

David Hammen
  • 32,454
  • 9
  • 60
  • 108
  • If anyone thinks your final example is better than your penultimate one, they really are a 'standards idiot'. Code repetition is bad. – john Aug 28 '11 at 06:30
  • @john: There are, unfortunately, plenty of people who do think my final example is preferable. The small amount of replicated code doesn't bother me near so much as does the loss of atomicity. The same concept applies to function that is *supposed to* perform a complex transaction in a mulithreaded environment (but sometimes the transaction fails). There are plenty of places where programming with side-effects is not only acceptable but is the only way to go. – David Hammen Aug 28 '11 at 06:49
  • The idium was Inhereted from older languages and is practically universal – Martin York Aug 28 '11 at 15:56
  • @Martin: I said that: "A very similar idiom exists in C. This idiom is very old, very widely used, and very widely viewed as *the* correct way to read lines from a file." – David Hammen Aug 28 '11 at 16:41
2

Actually I prefer another way

for (;;)
{
  std::string line;
  if (!getline(myFile, line))
    break;
  ...
}

To me it reads better, and the string is scoped correctly (i.e. inside the loop where it is being used, not outside the loop)

But of the two you've written the second is correct.

john
  • 85,011
  • 4
  • 57
  • 81
  • To me, an infinite loop with a break in the middle is on the border of being an anti-pattern. – David Hammen Aug 28 '11 at 05:52
  • 1
    If you care for scope of the string variable, then why don't you simply write `for(string line; getline(fs,line); ) {}`? – Nawaz Aug 28 '11 at 05:53
  • @David Hammen: Break at the top I think you'll find. The scoping issue is the real gain. But there you go, style issues, people never agree. – john Aug 28 '11 at 05:54
  • 1
    @Nawaz: I think because when I started programming C++ that wasn't possible. But good point. – john Aug 28 '11 at 05:55
  • @David Hammen, I used to program Pascal, Pascal doesn't have a break and you were for ever seeing this kind of loop `code; while (condition) { ...; code; }` Now that is an anti-pattern because of the repetition of `code`. Now you can clean that up with `while (function(...)) { ... }` where function incorporates both `code;` and `condition` or you can do it my way `for (;;) { code; if (!condition) break; ... }`. To me both are reasonable depending on which produces the cleaner result, which is of course subjective. – john Aug 28 '11 at 06:20
  • @john: If your driving concern is scoping, you can still use the standard idiom via `{std::string line; while (std::getline(fs, line)) {...}}`. A driving reason to use *the* standard idiom is that any reasonably intelligent programmer who reads your code (say for a code review) will immediately understand what you are doing. Do something different and the programmer will wonder why you are doing this and will also question your programming savvy -- and usually for good reason. – David Hammen Aug 28 '11 at 06:44
  • @Cheers David! I love you to. – john Aug 28 '11 at 06:54
  • Another problem with this is that the string gets allocated/dealocated each time round the loop.I'm with `for(std::string line; std::getline(is, line);){}` – Galik Jun 20 '16 at 17:01
1

The first one deallocated and re-allocated the string every loop, wasting time.
The second time writes the string to an already existing space removing the deallocation and reallocation, making it actually faster (and better) than the first one.

Daniel
  • 30,896
  • 18
  • 85
  • 139
  • 2
    Thanks. I'm not so concerned about that though. I could always move the declaration outside the loop, but I'm told (I could be wrong about this) that C++ compilers are smart about allocation and the best practice is to move the declaration as close to the usage as possible. – jda Aug 28 '11 at 04:48
  • @jda, i'm not sure that compiler would be smart enough to optimise the allocation (I could be wrong) but your second point is what matters. Don't micro-optimise unless you've a demonstrated need too, write clean code first. – john Aug 28 '11 at 05:45
  • An even halfway decent optimizing compiler will get rid of that repeated allocation and deallocation. This just isn't a concern unless you are compiling with optimization off. When you compile C++ with optimization off you will have lots and lots of performance concerns, this being just a tiny part of those concerns. – David Hammen Aug 28 '11 at 05:50
  • @David Hammen: actually no. Using LLVM compiler I tested the the compiler assembly outputs with `-O3` optimization and saw that `new` is being called in a loop when the string in the loop and called only one when the string is out of it. for reference the code is http://pastebin.com/6ErddNSS – Daniel Aug 28 '11 at 14:02
-5

Try this =>

// reading a text file
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
  string line;
  ifstream myfile ("example.txt");
  if (myfile.is_open())
  {
    while ( myfile.good() )
    {
      getline (myfile,line);
      cout << line << endl;
    }
    myfile.close();
  }

  else cout << "Unable to open file"; 

  return 0;
}
cola
  • 12,198
  • 36
  • 105
  • 165