
There are numerous questions about counting the number of lines in a file. I have successfully implemented this using std::count, as suggested in this question.

My present challenge is that I need some robustness when dealing with the files: some have a blank line at the end and others do not, and some have UNIX line endings while others have Windows line endings.

I have tried looking for a '\n' at an offset of -1 from the end of the input stream, but this has not been successful. The code would look something along these lines:

std::ios::pos_type current_location = is_->tellg();
is_->seekg(0);
auto saved_flags = is_->flags();
uint64_t total_records(0);

total_records = std::count(std::istreambuf_iterator<char>(*is_),
                           std::istreambuf_iterator<char>(), '\n');

// Check that the last character before the end of the file is not a '\n'
is_->seekg(-1,std::ios_base::end);
if (is_->peek() == '\n')
     total_records--;

// restore the saved position and flags
is_->seekg(current_location);
is_->flags(saved_flags);

return total_records ? ++total_records : 0;

However, this does not work: the count is not decremented by 1, so the function returns one record too many.

Obviously, this is a trivial problem if I can mandate that all files must have a trailing newline or must not have the trailing newline. I feel like allowing for both possibilities should not be that difficult, and that I am missing something obvious here.

UPDATE: Just as a matter of clarification, this is not a homework question. This is for a personal project that I have been working on.

Any help would be greatly appreciated.

Thanks- Shmuel

Shmuel Levine
  • You can use `getline` function to parse a stream one line at a time. It will, I believe, also grab the last line if it doesn't end with a new line. You can check the size of the resulting string or if it contains any whitespace you want to discard. As for different line endings, that complicates matters. You can use a tool to convert your files. – Neil Kirk Oct 20 '14 at 02:38
  • @NeilKirk, thanks for the suggestion. I was going to respond that I've been under the impression (i.e. read a few times) that using std::count with the iterator approach that I've written above will be much faster; however, I realize that I don't have much basis for saying that and I have never actually tested that assumption. So- I will give that a shot and try and compare the performance of the two approaches. Do you have any idea why my code did not work? Is there something wrong in the logic? – Shmuel Levine Oct 20 '14 at 13:23
  • Both are going to inspect every character of the file so I wouldn't expect a significant difference. But I've been surprised before. Remember to reuse the same string object each call. I don't know why your code doesn't work as I don't usually write it this way. I try to read an object at a time until an object fails to read. – Neil Kirk Oct 20 '14 at 13:29
