3

Let's say I have a vector<string> containing "a" and "b", I wanna copy itself 2 times so that the vector now contains "a", "b", "a", "b", "a", "b"

What is a better approach than using for and push_back?

NSF
  • 2,499
  • 6
  • 31
  • 55

4 Answers4

4

My initial thought:

myvec.reserve(myvec.size()*3);  //reserve not only speeds things upt it also protects us ftom iterator invalidation
vector<string>::iterator it = myvec.end();    //we ant to add two times the origional onto the end so we save the end of the origional end
myvec.insert(myvec.end(), myvec.begin(), it);
myvec.insert(myvec.end(), myvec.begin(), it);

thanks to Emilio Garavaglia for first pointing out problems with this, see here many reasons why this has problems: Does std::vector::insert() invalidate iterators if the vector has enough room (created through reserve)?

Second try:

std::size_t size = myvec.size();
myvec.resize(size*3);  //resize must protects us from iterator invalidation
vector<string>::iterator it = myvec.begin()+size;
std::copy(myvec.begin(),it,it);
std::copy(myvec.begin(),it,it+size);

since no implementation is going to implement a std::string whos default constructor allocates something on the heap this should cause less heap access and therefore be faster than others examples.

Another heap access minimization is to copy the vector into another insert it and then move in the originals, I stole Emilio Garavaglia code and pimped it:

{
vector<string> m = { "a", "b" };
auto n = m; // keep initial value safe
m.reserve(3 * m.size()); // preallocate, to avoid consecutive allocations
m.insert(m.end, n.begin(), n.end());
std::for_each(n.begin(),n.end(),[&n](std::string &in){n.emplace_back(std::move(in));});
}
Community
  • 1
  • 1
odinthenerd
  • 5,422
  • 1
  • 32
  • 61
  • Is auto somthing in C++ 11? – NSF Feb 08 '13 at 21:12
  • @NSF Yes it is, at least in this current usage. – Rapptz Feb 08 '13 at 21:13
  • As per the standard, iterators on a modified vector should always be considered invalid. What you say is true in the most of the implementation, but the standard doesn't set that as a requirement, hence we cannot trust it will work always. – Emilio Garavaglia Feb 08 '13 at 21:14
  • 1
    @EmilioGaravaglia yes I just looked it up and your right... damn gotchas – odinthenerd Feb 08 '13 at 21:15
  • @NSF I removed it to avoid confusion – odinthenerd Feb 08 '13 at 21:15
  • 1
    In my opinion your first try should be ok. To cite [en.cppreference.com](http://en.cppreference.com/w/cpp/container/vector/insert): _If the new size() is greater than capacity(), all iterators and references are invalidated. Otherwise, only the iterators and references **after the added element** are invalidated_. As you add on the end, this shouldn't be a problem. – Christian Ammer Feb 08 '13 at 22:15
  • I was wrong and deleted my answer. When calling `vec.insert(p, i, j)`, then `i`, `j` [must not be iterators into `vec`](http://stackoverflow.com/a/14792174/237483). – Christian Ammer Feb 10 '13 at 10:36
2
vector<string> m = { "a", "b" };

auto n = m; // keep initial value safe

m.reserve(3 * m.size()); // preallocate, to avoid consecutive allocations
m.insert(m.end, n.begin(), n.end());
m.insert(m.end, n.begin(), n.end());
Emilio Garavaglia
  • 20,229
  • 2
  • 46
  • 63
  • save the second insert, swap in the values from n, this will keep you from allocating a bunch of string buffers on the heap, copying contents and then throwing the originals away anyway. – odinthenerd Feb 08 '13 at 21:35
  • +1 for pointing out my error earlier and because I stole and only slightly modified your code for my third example – odinthenerd Feb 11 '13 at 22:03
2

My first thought is the following to avoid problems with invalidating iterators.

{   // note: nested scope
    vector<string> temp(vec); // 1st copy
    temp.insert(temp.end(), vec.begin(), vec.end()); // 2nd copy
    temp.insert(temp.end(), vec.begin(), vec.end()); // 3rd copy
    temp.swap(vec); // swap contents before temp is destroyed
}

On review, I think PorkyBrain and Emilio Garavaglia's answers might make more sense.

Blastfurnace
  • 18,411
  • 56
  • 55
  • 70
  • 1
    Emilio Garavaglia's answer is the same as yours (except for the `reserve` call, which is optional). Yours has the added benefit of giving a stronger exception safety guarantee (if the second `insert` fails your code won't change the original vector) so I give it +1 :) – Alex Feb 08 '13 at 21:26
  • if vector has to grow in the middle of this its going to be slow, reserve is missing. – odinthenerd Feb 08 '13 at 21:34
  • @PorkyBrain: It's amortized constant time so I'd profile to see if it's an actual bottleneck before trying savagely optimized alternatives. – Blastfurnace Feb 08 '13 at 21:39
  • most implementations grow by a factor of 1.5 or 2. so without resize you are going to grow in a size *= 3 more times than without it plain and simple. – odinthenerd Feb 08 '13 at 23:29
1

Here is a simple method:

template<typename T, typename A1, typename A2>
std::vector<T, A1> operator+( std::vector<T, A1> left, std::vector<T, A2> const& right ) {
  left.insert( left.end(), right.begin(), right.end() );
  return left;
}


int main() {
  std::vector<string> m = { "a", "b" );

  m = m + m + m;
}

but as @ChristianAmmer noted, overriding operator+ on a std::vector is ambiguous. And that would be wrong.

So you could go and write an entire infix named operator syntax, and use the magic of C++ to embed it in the C++ language, to get rid of that ambiguity. Sort of like this:

#include <utility>

template<typename Operation, short op>
struct InfixOp {
  Operation* self() { return static_cast<Operation*>(this); }
  Operation const* self() const { return static_cast<Operation const*>(this); }
};

template<typename first_type, typename Infix, short op>
struct PartialForm {
  Infix const* infix;

  first_type a;

  template<typename T>
  PartialForm(T&& first, Infix const* i):infix(i), a(std::forward<T>(first)) {}
};

#define OVERRIDE_OPERATORS(OP, CODE) \
template<\
  typename Left,\
  typename Operation\
>\
PartialForm<typename std::remove_reference<Left>::type, Operation, CODE> operator OP( Left&& left, InfixOp<Operation, CODE> const& op ) {\
  return PartialForm<typename std::remove_reference<Left>::type, Operation, CODE>( std::forward<Left>(left), op.self() );\
}\
\
template<\
  typename Right,\
  typename First,\
  typename Operation\
>\
auto operator OP( PartialForm<First, Operation, CODE>&& left, Right&& right )\
  ->decltype( (*left.infix)( std::move( left.a ), std::forward<Right>(right)) )\
{\
  return (*left.infix)( std::move( left.a ), std::forward<Right>(right) );\
}

OVERRIDE_OPERATORS(+, '+')
OVERRIDE_OPERATORS(*, '*')
OVERRIDE_OPERATORS(%, '%')
OVERRIDE_OPERATORS(^, '^')
OVERRIDE_OPERATORS(/, '/')
OVERRIDE_OPERATORS(==, '=')
OVERRIDE_OPERATORS(<, '<')
OVERRIDE_OPERATORS(>, '>')
OVERRIDE_OPERATORS(&, '&')
OVERRIDE_OPERATORS(|, '|')
//OVERRIDE_OPERATORS(!=, '!=')
//OVERRIDE_OPERATORS(<=, '<=')
//OVERRIDE_OPERATORS(>=, '>=')


template<typename Functor, char... ops>
struct Infix:InfixOp<Infix<Functor, ops...>, ops>...
{
  Functor f;
  template<typename F>
  explicit Infix(F&& f_in):f(std::forward<F>(f_in)) {}
  Infix(Infix<Functor, ops...> const& o):f(o.f) {}
  Infix(Infix<Functor, ops...>&& o):f(std::move(o.f)) {}
  Infix(Infix<Functor, ops...>& o):f(o.f) {}
  template<typename L, typename R>
  auto operator()( L&& left, R&& right ) const
    -> decltype( f(std::forward<L>(left), std::forward<R>(right)))
  {
    return f(std::forward<L>(left), std::forward<R>(right));
  }
};

template<char... ops, typename Functor>
Infix<Functor, ops...> make_infix( Functor&& f )
{
  return Infix<Functor, ops...>(std::forward<Functor>(f));
}

#include <vector>

struct append_vectors {
  template<typename T>
  std::vector<T> operator()(std::vector<T> left, std::vector<T>const& right) const {
    left.insert(left.end(), right.begin(), right.end());
    return std::move(left);
  }
};

struct sum_elements {
  template<typename T>
  std::vector<T> operator()(std::vector<T> left, std::vector<T>const& right) const {
    for(auto it = left.begin(), it2 = right.begin(); it != left.end() && it2 != right.end(); ++it, ++it2) {
      *it = *it + *it2;
    }
    return left;
  }
};
struct prod_elements {
  template<typename T>
  std::vector<T> operator()(std::vector<T> left, std::vector<T>const& right) const {
    for(auto it = left.begin(), it2 = right.begin(); it != left.end() && it2 != right.end(); ++it, ++it2) {
      *it = *it * *it2;
    }
    return left;
  }
};

#include <iostream>

int main() {
  auto append = make_infix<'+'>(append_vectors());
  auto sum = make_infix<'+'>(sum_elements());
  auto prod = make_infix<'*'>(prod_elements());

  std::vector<int> a = {1,2,3};
  a = a +append+ a +append+ a;
  a = a +sum+ a;
  a = a *prod* a;

  std::cout << a.size() << "\n";
  for (auto&& x:a) {
    std::cout << x << ",";
  }
  std::cout << "\n";
}

which has the advantage of clarity at the point where you use it (I mean, a = a +append+ a is pretty clear what it intends to do), at the cost of being a bit tricky to understand how it works, and a bit verbose for such a simple problem.

But at least the ambiguity is gone, which is a good thing, right?

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
  • Overloading the `operator+=` for `std::vector` is considered [ambiguous](http://stackoverflow.com/q/6366231/237483), this (in my opinion) also applies to `operator+`. – Christian Ammer Feb 08 '13 at 22:36
  • this has the same problem of invalidating iterators as we have been discussing. – odinthenerd Feb 08 '13 at 23:25
  • @PorkyBrain I do not think so. Check the lifetime on `left` more carefully. – Yakk - Adam Nevraumont Feb 09 '13 at 21:21
  • @ChristianAmmer I got rid of the ambiguity. Better? – Yakk - Adam Nevraumont Feb 10 '13 at 03:30
  • Do you think that it is better? Why it must be the operator overloading thing at all? What is wrong with writting a function with a spelling name (but I'm afraid to suggest something)? Why the inventors of the standard library have made a `operator+` for `std::string` but not for `std::vector`? Keep in mind that other's are reading your code and therefore the intention should be clear. – Christian Ammer Feb 10 '13 at 08:01
  • @ChristianAmmer How is `a = a +append+ a;` not clear in intent? :) I will admit that how it works is tricky, but the intent at the use-site is clear as day. You complained it was ambiguous, so I fixed it -- it is no longer ambiguous! Regardless, yes, you can implement this as `a = append(a, a);` instead of using an operator. (Aside: `a +append= a;` and `a +sum= a;` should also be doable using this technique) – Yakk - Adam Nevraumont Feb 10 '13 at 15:07
  • `a +append+ a` looks strange, although I would guess what it does I would look at the implementation details (which are tricky enough and magic, as you self have said). Why the operator overloading thing? In my opinion you should think twice and be carefully with this technic (but that's my opinion, so do as you like). – Christian Ammer Feb 10 '13 at 19:25
  • oh ok I thought you were taking left by reference, in that case, how is this efficient? m cant be moved in because it is not a rvalue so your causing lots of copies right? – odinthenerd Feb 11 '13 at 20:28
  • @PorkyBrain Only one copy is created. See [pass by value](http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/) -- every other invocation of the operator is passed a temporary in the first argument, and hence is implicitly `move`d and/or the copy can be elided (!), and all of the returns are xvalues, so are implicitly `move`d (or elided). The compiler is free to create one copy of `m`, then do two appends on it, then `move` that on top of `m`, by eliding every return and every pass-by-value, and may not create two copies by the standard (without moving). – Yakk - Adam Nevraumont Feb 11 '13 at 20:36
  • Also note that `m = m + (m + m);` would be less efficient. – Yakk - Adam Nevraumont Feb 11 '13 at 20:39
  • ok I get it, so its basically the same thing as blastfurnace is doing with a move at then end rather than a swap and wrapped in an operator template – odinthenerd Feb 11 '13 at 20:51