1

Seems kind of a tough problem because you can't add new member functions to vector. This form avoids the least amount of copies:

std::vector<T>& operator+=(std::vector<T>& lhs, const std::vector<T>& rhs)

But fails for self-assignment, so the only one that works for self-assignment is:

template <typename T>
std::vector<T>& operator+=(std::vector<T>& lhs, std::vector<T> rhs)
{
    lhs.insert(lhs.end(), rhs.begin(), rhs.end());
    return lhs;
}

But this requires an extra copy. What's the correct way of doing this?

It was ambiguous in my question that the above forms "don't work" because they appear to work for ints (although not for std::strings). It was pointed out this is because it's undefined behavior.

2 Answers2

3

The problem with:

template <typename T>
std::vector<T>& operator+=(std::vector<T>& lhs, const std::vector<T>& rhs)
{
    lhs.insert(lhs.end(), rhs.begin(), rhs.end());
    return lhs;
}

is not the signature, it's passing iterators to insert that become invalid before insert has completed.

Just use the correct technique for appending a vector to itself, and no extra copy is needed.

template <typename T>
void concat_in_place(std::vector<T>& lhs, const std::vector<T>& rhs)
{
    auto left_count = lhs.size();
    auto right_count = rhs.size();
    lhs.resize(left_count + right_count);
    std::copy_n(rhs.begin(), right_count, lhs.begin() + left_count);
}
Community
  • 1
  • 1
Ben Voigt
  • 277,958
  • 43
  • 419
  • 720
  • 1
    +1: It's correct and actually answers the question. The only thing left to make this perfect is to put a standard quote about `std::vector::insert` behaviour. – milleniumbug Oct 18 '14 at 19:53
  • So, does adding a call to reserve count as a correct technique? – Marc Glisse Oct 18 '14 at 19:55
  • @MarcGlisse: Please follow the link. – Ben Voigt Oct 18 '14 at 19:58
  • But it becomes fine if you replace `rhs.xxx()` with `addressof(*rhs.xxx())`? – Marc Glisse Oct 18 '14 at 20:37
  • @MarcGlisse: I can't imagine any way for an implementation to break if you use pointers and have called `reserve`, but formally, the Standard says that iterators, pointers, and references all become invalid unless they are before the insertion point. – Ben Voigt Oct 18 '14 at 21:58
  • I don't think the standard says that pointers become invalid. `&v[0]+v.size()` points one past the end of the original vector, and the standard guarantees that the first new element will be inserted exactly there. (I shouldn't have suggested `&*rhs.end()` for that pointer) Anyway, I am reporting it to LWG to clarify that even the iterator version should be fine. – Marc Glisse Oct 18 '14 at 22:15
  • @MarcGlisse: Looking again, you're right, the rule "when there is no reallocation" mentions only iterators and references, not pointers. However, the rules for pointers and references used after the lifetime of an object has ended are the same, so I wouldn't assume pointers remain valid when references do not. – Ben Voigt Oct 18 '14 at 22:20
  • @MarcGlisse: Please note that this issue has been brought to LWG before, I suggest you review that material before submitting a new report. – Ben Voigt Oct 18 '14 at 22:21
  • @BenVoigt I did go through the issue list before and couldn't find anything about it. And Daniel (who maintains the list) suggested I open a new issue... – Marc Glisse Oct 18 '14 at 22:29
  • @BenVoigt The big difference between invalidating reference/iterator or pointer is that in one case they consider the associated object (which indeed changes) and in the other case I am only interested in the position (which cannot change). – Marc Glisse Oct 18 '14 at 22:32
  • @MarcGlisse: References are bound to memory locations, not objects. In that respect they are just like pointers. BTW, issue 242 is reasonably relevant, in that other rules are being clarified to "but operations that invalidate the `end` iterator are still forbidden". This restriction isn't a defect in the Standard; the iterator version is not intended to be allowed. – Ben Voigt Oct 18 '14 at 22:38
  • @MarcGlisse: And this is relevant too: https://groups.google.com/a/isocpp.org/forum/#!topic/std-discussion/dhy23mDFXj4 – Ben Voigt Oct 18 '14 at 22:45
  • For reference, this is now LWG issue 2449. – Marc Glisse Oct 22 '14 at 17:51
0

You should not, as a matter of policy, overload two different std types with an operator between them. It may be undefined behavior: the standard is ambiguous.

If you want operator syntax on std containers, I would recommend named operators. It is also more clear, as operators on a vector could be container operators or element-wise operators, which is why they are missing by default. v +append= v2; is clearly appending. (create a static append object, and overload its lhs and rhs operators with your vector, and use expression templates in the intermediate steps)

// mini named operator library.  Only supports + for now:
template<class Kind>
struct named_operator {};
template<class OP, class LHS> struct plus_ {
  LHS lhs;
  template<class RHS>
  decltype(auto) operator=(RHS&&rhs)&&{
    return plus_assign(std::forward<LHS>(lhs), OP{}, std::forward<RHS>(rhs));
  }
  template<class RHS>
  decltype(auto) operator+(RHS&&rhs)&&{
    return plus(std::forward<LHS>(lhs), OP{}, std::forward<RHS>(rhs));
  }
};
template<class Tag, class LHS>
plus_<Tag,LHS> operator+( LHS&& lhs, named_operator<Tag> ) {
  return {std::forward<LHS>(lhs)};
}

// creating a named operator:
static struct append_tag:named_operator<append_tag> {} append;

// helper function, finds size of containers and arrays:
template<class T,std::size_t N>
constexpr std::size_t size( T(&)[N] ) { return N; }
template<class C>
constexpr auto size(C&& c)->decltype(c.size()) { return c.size(); }

// implement the vector +append= range:
template<class T, class A, class RHS>
std::vector<T,A>& plus_assign(std::vector<T,A>&lhs, append_tag, RHS&& rhs) {
  auto rhs_size = size(rhs);
  lhs.reserve(lhs.size()+rhs_size);
  using std::begin; using std::end;
  copy_n( begin(rhs), rhs_size, back_inserter(lhs) );
  return lhs;
}
// implement container +append+ range:
template<class LHS, class RHS>
LHS plus( LHS lhs, append_tag, RHS&& rhs ) {
  using std::begin; using std::end; using std::back_inserter;
  copy_n( begin(rhs), size(rhs), back_inserter(lhs) );
  return std::move(lhs);
}

live example

Note that std::vector<int> +append= std::list<int> +append+ std::array<double, 3> works with the above code.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
  • You don't need to test for identity; see the `concat_in_place` code snippet in my answer. It's correct for both self-reference and two independent vectors, and also potentially faster for two independent vectors than the original code. – Ben Voigt Oct 18 '14 at 20:15
  • 1
    @BenVoigt good point. And while I'm at it, I'll write another iteration of my named operator mini library. It is getting pretty compact. While the `static` methods in the tag type is cute, I'm thinking that next time I'll use 3 argument free functions of the type `plus( LHS, append_tag, RHS )` and `plus_assign( LHS, append_tag, RHS )`, as that lets people add new meanings for `append` far away from where the tag is defined. – Yakk - Adam Nevraumont Oct 18 '14 at 21:49
  • @BenVoigt yes, it is cleaner to do the 3 argument `plus` instead of `static` methods. Added to post! – Yakk - Adam Nevraumont Oct 18 '14 at 21:59