Why does std::istream_iterator<> with multiple copy_n() calls always write the first value?

Question

I tried to copy the input line into multiple vectors:

#include <vector>
#include <sstream>
#include <istream>
#include <iterator>
#include <algorithm>
#include <iostream>

int main(){
  std::vector<int> v1, v2, v3;
  std::istringstream is ("1 2 3 4 5 6");
  std::istream_iterator<int> iit (is);
  std::copy_n(iit, 2, std::back_inserter(v1));
  std::copy_n(iit, 2, std::back_inserter(v2));
  std::copy(iit, std::istream_iterator<int>(), std::back_inserter(v3));
  std::ostream_iterator<int> oit(std::cout, ", ");
  std::copy(v1.begin(),v1.end(), oit);
  std::cout << "\n";
  std::copy(v2.begin(),v2.end(), oit);
  std::cout << "\n";
  std::copy(v3.begin(),v3.end(), oit);
  std::cout << "\n";
  return 0;

}

I expected this program to output:

1, 2, 
3, 4, 
5, 6,

But I get this:

1, 2, 
1, 3, 
1, 4, 5, 6,

Why does copy_n always insert 1 at the beginning of each vector?

If you read [the first sentence of `std::istream_iterator`'s description](https://en.cppreference.com/w/cpp/iterator/istream_iterator), combine it with the fact that the shown code passes the iterator ***by value*** to `std::copy_n`, and `std::copy_n` modifies the iterator, and then the shown code attempts to use the same, original value of the iterator, what conclusion can you draw from this? — Sam Varshavchik, Jul 16 '20 at 18:03
@SamVarshavchik Are you talking about "The first object is read when the iterator is constructed"? So I probably have to create a new istream_iterator every time I use a copy? — Fedor Goncharov, Jul 16 '20 at 18:21

Asteroids With Wings · Accepted Answer · 2020-07-16T18:43:56.757

This comes down to a perhaps unintuitive fact of istream_iterator: it doesn't read when you dereference it, but instead when you advance (or construct) it.

(x indicates a read)

Normal forward iterators:

   Data:            1   2   3   (EOF)
   
   Construction
   *it              x
   ++it
   *it                  x
   ++it
   *it                      x
   ++it                         (`it` is now the one-past-the-end iterator)
   Destruction

Stream iterators:

   Data:            1   2   3   (EOF)
   
   Construction     x
   *it
   ++it                 x
   *it
   ++it                     x
   *it
   ++it                         (`it` is now the one-past-the-end iterator)
   Destruction

We still expect the data to be provided to us via *it. So, to make this work, each bit of read data has to be temporarily stored in the iterator itself until we next do *it.
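The caching described above can be sketched with a small stand-in class. `CachedIntReader` is a hypothetical name, and this is a simplified illustration of the idea rather than the standard library's actual implementation: the read happens in the constructor and in `operator++`, while `operator*` only hands back the stored value.

```cpp
#include <istream>
#include <sstream>

// Hypothetical, simplified stand-in for std::istream_iterator<int>.
// Not the real implementation; it only mirrors the read-ahead behaviour.
class CachedIntReader {
public:
    // Construction already performs the first read.
    explicit CachedIntReader(std::istream& in) : in_(&in) { read(); }

    // Dereferencing never touches the stream; it returns the cached value.
    int operator*() const { return value_; }

    // Advancing performs the next read.
    CachedIntReader& operator++() { read(); return *this; }

    // True once a read has failed (e.g. at EOF).
    bool exhausted() const { return in_ == nullptr; }

private:
    void read() {
        if (in_ && !(*in_ >> value_)) {
            in_ = nullptr;  // a failed read marks the end of the sequence
        }
    }

    std::istream* in_;
    int value_ = 0;
};
```

A copy of such an object carries its own `value_`, which is why a copy made before an increment keeps reporting the old value even though the shared stream has moved on.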

So, when you create iit, it's already pulling the first number out for you, 1. That data is stored in the iterator. The next available data in the stream is 2, which you then pull out using copy_n. In total that's two pieces of information delivered, out of a total of two that you asked for, so the first copy_n is done.

The next time, you're using a copy of iit in the state it was in before the first copy_n. So, although the stream is ready to give you 3, you still have a copy of that 1 "stuck" in your copied stream iterator.
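Here is a small sketch of that "stuck" value using the real `std::istream_iterator`. The function name `stale_copy_demo` is made up for illustration; it plays the role of `copy_n` advancing its own by-value copy while the caller's iterator stays behind.

```cpp
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical demo function: records what each iterator reports.
std::vector<int> stale_copy_demo() {
    std::istringstream is("1 2 3 4 5 6");
    std::istream_iterator<int> original(is);         // construction reads 1 and caches it
    std::istream_iterator<int> advanced = original;  // the copy caches 1 as well

    ++advanced;  // reads 2 from the shared stream
    ++advanced;  // reads 3

    std::vector<int> seen;
    seen.push_back(*original);  // still the cached 1
    seen.push_back(*advanced);  // 3
    ++original;                 // the stream has moved on, so this reads 4
    seen.push_back(*original);
    return seen;
}
```

The returned vector holds 1, 3, 4: the stale iterator first yields its cached 1, and its next increment picks up wherever the stream currently is, which matches the pattern in the question's output.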

Why do stream iterators work this way? Because you cannot detect EOF on a stream until you've tried and failed to obtain more data. If it didn't work this way, you'd have to do a dereference first to trigger this detection, and then what should the result be if we've reached EOF?

Furthermore, we expect that any dereference operation produces an immediate result; with a container that's a given, but with streams you could otherwise be blocking waiting for data to become available. It makes more logical sense to do this blocking on the construction/increment, instead, so that your iterator is always either valid, or it isn't.

If you drop the copies and construct a fresh stream iterator for each copy_n, you should be fine. Though I would generally recommend only using one stream iterator per stream, as that'll avoid anyone having to worry about this.
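One way to stick to a single stream iterator is to advance it in place instead of handing copies to `copy_n`. The sketch below assumes a hypothetical helper, `take_n`, which is not a standard algorithm; it takes the iterator by reference so the caller's iterator always reflects what has been consumed. (The fresh-iterator approach also works with common implementations, but as far as I know it relies on `copy_n` not incrementing past the last element it copies.)

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical helper (not part of the standard library): copies up to n
// values and advances `it` itself, so no stale copy is left behind.
template <class InputIt, class OutputIt>
void take_n(InputIt& it, InputIt end, std::size_t n, OutputIt out) {
    for (; n > 0 && it != end; --n, ++it) {
        *out++ = *it;
    }
}

// Splits the question's input into the three vectors it was aiming for.
std::vector<std::vector<int>> split_input() {
    std::istringstream is("1 2 3 4 5 6");
    std::istream_iterator<int> it(is), end;  // one iterator for the whole stream

    std::vector<int> v1, v2, v3;
    take_n(it, end, 2, std::back_inserter(v1));
    take_n(it, end, 2, std::back_inserter(v2));
    std::copy(it, end, std::back_inserter(v3));  // the rest of the stream
    return {v1, v2, v3};
}
```

With the input from the question this yields {1, 2}, {3, 4} and {5, 6}. The helper increments after every copied element, so the value cached in `it` is always the next unconsumed one.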

So when you recommend using one stream iterator per stream, does that mean creating a new iterator every time I use it in copy or anywhere else? — Fedor Goncharov, Jul 16 '20 at 18:40
Actually yes that should be fine too. Let me know how it goes. — Asteroids With Wings, Jul 16 '20 at 18:42
@FedorGoncharov, a stream is like... a stream, a river. Once a chunk of it passes, you will not see it again. — Enlico, Jul 16 '20 at 18:43
@EnricoMariaDeAngelis For some categories of stream ;) But yes it is certainly helpful to think of them as flows of data, rather than as containers. — Asteroids With Wings, Jul 16 '20 at 18:44
@AsteroidsWithWings, I was referring to the only concept I know of as _stream_; what stream is re-iterable? — Enlico, Jul 16 '20 at 18:46
@EnricoMariaDeAngelis `fstream`, `stringstream`, `strstream`, ... — Asteroids With Wings, Jul 16 '20 at 18:46
