2

this question seems to be quite similar to others around here, but some crucial details are different in this case (they always copy the element of the iterator to the vector as far as I've seen posts).

Sticking with a quite common idiom (as far as I read), I'm doing file -> ifstream -> istream_iterator -> vector (this approach calls >> on type of the vector). The problem I've got with this is that according to the reference, istream_iterator

Dereferencing only returns a copy of the most recently read object

In my case I want to read some objects containing a vector-member, which means that the currently read object is copied to insert it into the vector (which means to copy construct the whole vector-member) and immediately afterwards it is constructed completely new when reading the new object.

So this copy is unneeded and just slowing down the whole thing. Is there a way to eliminate this uneeded copy (I imagine, the istream_iterator might just be moveing the object out and construct a new one afterwards, but maybe there is another way to avoid this copying)?

For illustration of the problem see some example code:

#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

struct My_class {
  std::vector<std::string> vec;

  // default/copy/move constructors to illustrate what constructors are being invoked
  My_class() : vec{} {
      std::cout << " default" << std::endl;
  }
  My_class(My_class const &other) : vec{other.vec} {
      std::cout << " copy" << std::endl;
  }
  My_class(My_class &&other) noexcept : vec{std::move(other.vec)} {
      std::cout << " move" << std::endl;
  }


  friend std::istream& operator>>(std::istream& is, my_class& i) {
    i.vec.clear(); // needed since i still contains the data from the previous object read
    std::string l;
    // in other use cases some more lines would be read or maybe the lines get preprocessed (e.g. split it up) and then inserted into the vector
    std::getline(is, l);
    i.vec.push_back(l);
    return is;
  }
};

int main(int argc, char* argv[]) {
  std::ifstream ifs{argv[1]};
  std::vector<my_class> data{std::istream_iterator<my_class>{ifs}, {}};

  for(const auto& _ele : data){
      for(const auto& ele : _ele.vec){
          std::cout << ele << " ";
      }
      std::cout << std::endl;
  }
}

Second try:

#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>
#include <memory>
#include <ranges>

static int cpy_cnt = 0;

struct My_class {
    std::vector<std::string> vec;

  // default constructor
  My_class() : vec{} {
    std::cout << vec.size() << " default" << std::endl;
  }

  My_class(My_class&&)      = default;
  My_class(const My_class&) = default;

  My_class& operator=(My_class&&) = default;
  My_class& operator=(My_class&)  = default;

  friend std::istream& operator>>(std::istream& is, My_class& i) {
    i.vec.clear();
    std::string l;
    std::getline(is, l);
    i.vec.push_back(l);
    return is;
  }
};

int main(int argc, char* argv[]) {
  std::ifstream ifs{argv[1]};
  std::cout << "iter create" << std::endl;
  auto tmp = std::views::istream<My_class>(ifs);
  std::cout << "vec create" << std::endl;
  std::vector<My_class> data2{};
}

gives an error containing error: no matching function for call to '__begin' (not gonna paste the complete long error message of clang).

I've been fiddling around with gcc and clang and it seams, that the error only occurs with clang while with gcc everything is just fine.

Configuring with cd build && cmake -DCMAKE_BUILD_TYPE=Debug -D CMAKE_C_COMPILER=clang -D CMAKE_CXX_COMPILER=clang++ -DCMAKE_EXPORT_COMPILE_COMMANDS=1 .. and cd build && cmake -DCMAKE_BUILD_TYPE=Debug -D CMAKE_C_COMPILER=gcc -D CMAKE_CXX_COMPILER=g++ -DCMAKE_EXPORT_COMPILE_COMMANDS=1 ..

$ g++ --version
g++ (GCC) 12.2.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ clang++ --version
clang version 14.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
atticus
  • 138
  • 15

1 Answers1

1

Like you have found out, that derefencing istream_iterator will give you a copy. One workaround for you is to use views::istream and ranges::move:

std::ifstream ifs{argv[1]};
auto ifs_view = std::views::istream<My_class>(ifs);

std::vector<My_class> data;
std::ranges::move(ifs_view, std::back_inserter(data));

Depends on your use case, you can also loop on ifs_view directly to avoid any constructions(since your operator >> insert data directly to the underlying vector):

for(const auto& _ele : ifs_view) { 
   // for loop logics here...
}

Sidenote, istream_view requires the object to be assignable, and your My_class's assigning operators are implicitly deleted right now, so you would need to either provide at least one yourself, or just default the copy/move constructors.

Ranoiaetep
  • 5,872
  • 1
  • 14
  • 39
  • Based from reading the docs and your explanation, this looks pretty nice. It's always about knowing the right (standard)library stuff ^^. Thanks foe pointing out. Should even be possible to construct the vector by iterator on `views::istream`. Gonna check this out tomorrow. – atticus Dec 03 '22 at 18:05
  • Just tried this, but I've got an error left `error: no matching function for call to '__begin'` which originates from `auto tmp = std::views::istream(ifs);`. Any Idea how to solve this? (I already defaulted copy/move constructors and copy/move assignment) – atticus Dec 04 '22 at 13:42
  • 1
    @atticus I would need some more detail to tell the problem. Would you like to share your code, or share the entire error message? Include the test input if possible. – Ranoiaetep Dec 04 '22 at 21:10
  • Added it to the original question (there I can add huge codeblocks much nicer). Seems to be a `gcc` vs `clang` thing. Nevertheless, maybe you've got an idea how to get it working on both (if that's possible), or you can try if it works with your `clang` version. – atticus Dec 04 '22 at 22:01
  • 1
    @atticus I don't have clang installed right now, but based on the error message shown here: https://godbolt.org/z/dsrMoaMvb, it seems to be faulty implementation by clang since `stream_view` should not require `contiguous_iterator`. Also using trunk version of clang does compile, so it seems to be corrected in future release. – Ranoiaetep Dec 04 '22 at 22:16
  • Sorry for bothering you one last time. I see no only moves take place, no copy. But I also see no additional (default) constructors are being called. Now I wonder, can I still rely on having a "valid" object `i` in the `operator>>` function? (Normally after doing the move, the source object is in an unspecified-state, which means just doing the `i.vec.push_back()` might be a bad idea, right? So in effect I'd need to kinda reset all members of the struct in the `operator>>` function, before working on it, right?). – atticus Dec 05 '22 at 00:51
  • 1
    @atticus *'can I still rely on having a "valid" object `i`'* -- Yes. The only thing that is unspecified is whether the original vector will be empty, the vector itself is still valid. You can see more discussion about it here: [Is a moved-from vector always empty?](https://stackoverflow.com/questions/17730689/is-a-moved-from-vector-always-empty). Either way, you are manually calling `vector::clear` in `operator>>` anyways, so you don't need to worry about whether it is empty. – Ranoiaetep Dec 05 '22 at 04:21