
I like the range for-loop in C++, and want to use it like this:

#include <bits/stdc++.h>

int main()
{
    for (auto s : LineReader("my-very-big-textfile.txt")) {
        cout << s << endl;
    }
    return 0;
}

The purpose here is to iterate through some data without first reading it all into a container. In this case the data are text strings (the lines of a text file), but in general it could be anything, including generated data.

Here LineReader returns an iterable "pseudo"-container. For that to work, the for loop needs iterators from the LineReader object. In C++ a range is defined in terms of a begin and an end iterator. But I want to use the range for-loop to iterate through data where the end might not be known at the start (for example, reading line by line through an excessively big text file without first scanning it to find the end).

So I define that like this:

Disclaimer: This is example code showing the principle, so I'm not cluttering it with std:: qualifiers, error handling, private/public keywords and so on...

struct ReadLineIterator {
    ifstream ifs;
    string line;

    ReadLineIterator() { }
    ReadLineIterator(string filename) : ifs(filename) { }

    bool operator!=(ReadLineIterator& other) {
        return !ifs.eof();
    }

    ReadLineIterator& operator++() {
        getline(ifs, line, '\n');
        return *this;
    }
    string operator*() {
        return line;
    }
};

struct LineReader
{
    string filename;
    LineReader(const string& filename) : filename(filename) { }

    ReadLineIterator begin()
    {
       return ReadLineIterator(filename);
    }

    ReadLineIterator end() // return a not used dummy iterator since this method must exist
    {
        return ReadLineIterator();
    }
};

When I run this, it works. But I'm skeptical whether

bool operator!=(ReadLineIterator& other) {
    return !ifs.eof();
}

is a proper way to make this operator detect the end of the sequence. This is because I don't have a proper end object (the end() method just returns a dummy iterator), and no comparison is made against it either. Instead I check whether the stream has reached end-of-file.

But I don't see how I could do this any other way. For now I'm happy with this approach since it works for me, but it would be great to know whether there are better ways of doing the same thing. It would also be nice to know whether this works with all C++ compilers (I'm using GCC) and, if so, whether it will keep working with future C++ standards where iterators might be handled differently.

  • As I read the rest of your question let me start out saying, never include `bits/` files: They're non-standard and non-portable among other things. Include the required standard headers instead. – Mark B Jun 28 '17 at 20:47
  • No, it's not a proper way to implement an iterator since `it != it` will usually return `true`. – Mark Ransom Jun 28 '17 at 20:47
  • That is not a valid iterator. You have to have the typedefs, and a handful of other operators. Also, as you note, `operator!=` doesn't check the `other` iterator. See [here](https://stackoverflow.com/questions/8054273/how-to-implement-an-stl-style-iterator-and-avoid-common-pitfalls/8054856#8054856) for how to make a proper iterator. Looks like you want an `input_iterator`. – Mooing Duck Jun 28 '17 at 20:47
  • Required reading: [`std::istream_iterator`](http://en.cppreference.com/w/cpp/iterator/istream_iterator). Especially: "The default-constructed `std::istream_iterator` is known as the end-of-stream iterator. When a valid `std::istream_iterator` reaches the end of the underlying stream, it becomes equal to the end-of-stream iterator." – Mooing Duck Jun 28 '17 at 20:48
  • You should also generally inherit from `std::iterator`, as it will generally do some of the work for you. – Alex Huszagh Jun 28 '17 at 20:49
  • @AlexanderHuszagh That's actually deprecated in C++17, I'm afraid. See for example https://stackoverflow.com/questions/37031805/preparation-for-stditerator-being-deprecated or https://stackoverflow.com/questions/43268146/why-is-stditerator-deprecated – Bob__ Jun 28 '17 at 21:14
  • @Bob__, well, I guess I'll be getting carpal tunnel sooner than I expected.... But thanks for the info. – Alex Huszagh Jun 28 '17 at 21:17
  • @MarkB Thanks, good to know. Usually I only use stdc++.h when testing C++ stuff in very small one file programs. I will keep this in mind, and include the appropriate headers next time when posting an example like this. – Visitor Iterator Jun 29 '17 at 07:02
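
The points raised in the comments above (compare against the `other` iterator, provide the iterator typedefs, let a default-constructed iterator act as end-of-stream) can be sketched roughly as follows. This is only an illustration under assumptions, not code from the question: the names `line_iterator`, `line_range` and `collect_lines` are hypothetical, and a null stream pointer is used as the end marker.

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical minimal input iterator over the lines of a stream.
class line_iterator {
    std::istream* in_ = nullptr;  // nullptr marks the end-of-stream state
    std::string line_;

    // Read the next line; on failure (EOF or error) become the end iterator.
    void read() {
        if (in_ && !std::getline(*in_, line_)) in_ = nullptr;
    }

public:
    using iterator_category = std::input_iterator_tag;
    using value_type = std::string;
    using difference_type = std::ptrdiff_t;
    using pointer = const std::string*;
    using reference = const std::string&;

    line_iterator() = default;  // end-of-stream iterator
    explicit line_iterator(std::istream& in) : in_(&in) { read(); }

    reference operator*() const { return line_; }
    pointer operator->() const { return &line_; }
    line_iterator& operator++() { read(); return *this; }
    line_iterator operator++(int) { line_iterator old = *this; read(); return old; }

    // Unlike the version in the question, both operands take part in the comparison.
    friend bool operator==(const line_iterator& a, const line_iterator& b) {
        return a.in_ == b.in_;
    }
    friend bool operator!=(const line_iterator& a, const line_iterator& b) {
        return !(a == b);
    }
};

// Hypothetical range wrapper so the iterator works in a range-based for loop.
class line_range {
    std::istream& in_;
public:
    explicit line_range(std::istream& in) : in_(in) {}
    line_iterator begin() const { return line_iterator(in_); }
    line_iterator end() const { return line_iterator(); }
};

// Usage sketch: collect the lines of an in-memory text.
std::vector<std::string> collect_lines(const std::string& text) {
    std::istringstream src(text);
    std::vector<std::string> out;
    for (const auto& s : line_range(src)) out.push_back(s);
    return out;
}
```

Because the end state is reached only when `std::getline` fails, a final line without a trailing newline is still delivered, and an empty stream yields an empty loop.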

2 Answers


I would do this in two parts.

One is a range class that just acts as a wrapper for a pair of stream iterators:

#include <istream>
#include <iterator>

template <class T>
class istream_range {
    std::istream_iterator<T> b;  // reads successive T objects from the stream
    std::istream_iterator<T> e;  // default-constructed: the end-of-stream iterator
public:
    istream_range(std::istream &is)
        : b(std::istream_iterator<T>(is))
        , e(std::istream_iterator<T>())
    {}

    std::istream_iterator<T> begin() { return b; }
    std::istream_iterator<T> end() { return e; }
};

So this lets us use istream_iterators in a range-based for loop:

for (auto const &s : istream_range<foo>(myfile))
    // do something with s

An istream_iterator uses operator>> to extract items from the specified file, so the second part is just a tiny type that extracts a line:

class line {
    std::string data;
public:
    friend std::istream &operator>>(std::istream &is, line &l) {
        std::getline(is, l.data);
        return is;
    }
    operator std::string() const { return data; }    
};

So, with this our for loop becomes something like:

for (auto const &s : istream_range<line>(myfile))
    // do something with s

The obvious advantage of this is decoupling the two: we can use istream_range<T> to process a file of T, for any T for which normal stream extraction does "the right thing" (including lots of custom extractors of which we can't currently have any awareness).

There are a few more possibilities covered in the answers to a previous question (including a LineInputIterator that seems to be a little closer to what you're asking for).
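
Putting the two parts together, a self-contained sketch might look like the following. The helper `read_all_lines` and the use of `std::istringstream` in place of a real file are assumptions made only for illustration. Declaring the loop variable as `std::string` triggers the conversion operator of `line`, so no explicit cast is needed.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Range wrapper around a pair of istream_iterators, as described above.
template <class T>
class istream_range {
    std::istream_iterator<T> b;
    std::istream_iterator<T> e;
public:
    explicit istream_range(std::istream& is)
        : b(std::istream_iterator<T>(is))
        , e(std::istream_iterator<T>())
    {}

    std::istream_iterator<T> begin() { return b; }
    std::istream_iterator<T> end() { return e; }
};

// Tiny type whose extraction operator reads one whole line.
class line {
    std::string data;
public:
    friend std::istream& operator>>(std::istream& is, line& l) {
        std::getline(is, l.data);
        return is;
    }
    operator std::string() const { return data; }
};

// Hypothetical helper: gather every line of `text` through the range.
std::vector<std::string> read_all_lines(const std::string& text) {
    std::istringstream src(text);  // stands in for an opened file
    std::vector<std::string> out;
    // Each `line` is converted to std::string when bound to `s`.
    for (const std::string& s : istream_range<line>(src)) out.push_back(s);
    return out;
}
```

Note that constructing the begin iterator already performs the first read, so the range should be built right before it is looped over.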

Jerry Coffin
  • It seems that some closing parentheses were missing in the istream_range initializer list; I added them, and then this solution worked and achieved what I wanted. Naturally I have to cast s explicitly to std::string before feeding it into std::cout. **Thank you very much for this explanation.** – Visitor Iterator Jun 29 '17 at 16:38
  • @VisitorIterator: Thanks for the bug report--I believe I've corrected the typos. Glad it's helpful. – Jerry Coffin Jun 29 '17 at 17:39

The standard template class std::istream_iterator<T> acts as an iterator that reads successive T objects from an istream (with operator>>(istream &, T &)), so all you need is a type T that reads lines from an istream:

class line {
    // The member must not share the class's name, or `line &l` below would not name the type.
    std::string text;
    friend std::istream &operator>>(std::istream &in, line &l) {
        return std::getline(in, l.text);
    }
public:
    operator std::string() const { return text; }
};

Now have your LineReader just return a std::istream_iterator<line> from begin(), and a default-constructed one (the end-of-stream iterator) from end().
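
One way this could look, as a rough sketch rather than a definitive implementation: here LineReader is assumed to own the `std::ifstream` itself, the member is named `text`, and `read_file_lines` is a hypothetical helper added only to show the loop.

```cpp
#include <fstream>
#include <istream>
#include <iterator>
#include <string>
#include <vector>

// Type whose extraction operator reads one whole line.
class line {
    std::string text;
    friend std::istream& operator>>(std::istream& in, line& l) {
        return std::getline(in, l.text);
    }
public:
    operator std::string() const { return text; }
};

// Assumption: the reader opens and owns the file stream.
class LineReader {
    std::ifstream ifs;
public:
    explicit LineReader(const std::string& filename) : ifs(filename) {}

    // Reads the first line immediately; if the file could not be opened,
    // the read fails and the result compares equal to end().
    std::istream_iterator<line> begin() { return std::istream_iterator<line>(ifs); }

    // Default-constructed istream_iterator is the end-of-stream iterator.
    std::istream_iterator<line> end() { return std::istream_iterator<line>(); }
};

// Hypothetical helper showing the range-based for loop from the question.
std::vector<std::string> read_file_lines(const std::string& filename) {
    std::vector<std::string> out;
    for (const std::string& s : LineReader(filename)) out.push_back(s);
    return out;
}
```

Since the stream is consumed as it is read, this range is single-pass: calling begin() a second time continues from wherever the stream currently is instead of restarting at the top of the file.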

LogicStuff
Chris Dodd