
Why is the function signature of push_back the following?

void push_back (const value_type& val);

The value passed is copied into the container anyway, so why not make the copy directly in the argument list?

void push_back (value_type val);
Filip Roséen - refp
user1438832
  • To avoid making yet another copy of the object. – R Sahu May 22 '14 at 06:25
  • What you are talking about is done the right way in `emplace_back`. However, both `push_back` and `emplace_back` are required for different situations and usages. – yzt May 22 '14 at 06:30
  • Also, when this interface was designed, C++ had no way of *moving* stuff out of destined-to-die objects, which means the interface you are suggesting would typically have resulted in two copies of the value being made (barring any compiler optimizations). – yzt May 22 '14 at 06:33
  • Note that in C++11, there is also a function signature `void push_back (value_type&& val);`. This allows the compiler to avoid the copy entirely if an [rvalue](http://www.cprogramming.com/c++11/rvalue-references-and-move-semantics-in-c++11.html) is passed in. – Chris Drew May 22 '14 at 06:54
  • See also [When is overloading pass by reference (l-value and r-value) preferred to pass-by-value?](http://stackoverflow.com/q/18303287/341970) – Ali May 22 '14 at 11:31

4 Answers


The answer is to avoid making yet another copy. Take a look at this simple example that illustrates the difference between using value_type and const value_type&.

#include <iostream>
using namespace std;

struct A
{
   A() {}

   A(A const& copy)
   {
      cout <<  "Came to A::A(A const& copy)\n";
   }

   void print() const
   {
      cout << "Came to A:print()\n";
   }

};

void foo(A const& a)
{
   A copy = a;
   copy.print();
}


void bar(A a)
{
   A copy = a;
   copy.print();
}

int main()
{
   A a;
   foo(a);
   bar(a);
}

Output of running the program:

Came to A::A(A const& copy)
Came to A:print()
Came to A::A(A const& copy)
Came to A::A(A const& copy)
Came to A:print()

Notice the additional call to the copy constructor due to the call to bar. For some objects, that additional copy construction and the corresponding destruction can be very expensive when the operation is carried out millions of times.

R Sahu
  • That’s just because your code is making unnecessary copies that `std::vector::push_back` does not need to make. I think OP’s contention was that, since the value gets copied into the container anyway, why not pass it by copy to `push_back`, and then *move* it to the container storage. Your answer doesn’t address this (very valid) point. – Konrad Rudolph May 22 '14 at 06:39
  • Why do `A copy = a;` inside `bar`? Why not use `a` directly? – Chris Drew May 22 '14 at 06:42
  • @KonradRudolph The *move* operation is not available in C++03. What my functions are doing is precisely what containers such as `std::vector` and `std::list` would do -- make a copy of the object and store the copy. – R Sahu May 22 '14 at 06:43
  • @ChrisDrew Hopefully my previous comment answered your question. – R Sahu May 22 '14 at 06:44
  • @RSahu The `std::vector` container isn’t available prior to C++98. ;-) However, neither pre-C++98 nor C++03 are relevant any more. Your answer is (only) historically correct – but why wasn’t this changed with C++11? The reason for this is much more subtle. – Konrad Rudolph May 22 '14 at 06:46
  • @RSahu: I understand, but I think your example could be confusing. You seem to be encouraging pass by reference-to-const when you need to make a copy internally. The modern advice is to pass-by-value and use the copy. It allows for more optimization by the compiler. For example, no copy is made if an rvalue is passed in. – Chris Drew May 22 '14 at 06:48
  • @KonradRudolph, Would you mind adding an answer to explain what you are thinking of? You are welcome to edit my answer too. – R Sahu May 22 '14 at 06:55
  • @RSahu Actually my reason for commenting rather than answering is that I am not confident in my ability to explain the reason properly. As I’ve said it’s a subtle issue. (Hint, this is a roundabout way for saying “I don’t know”.) – Konrad Rudolph May 22 '14 at 06:57
  • @KonradRudolph. I think the reason is that for optimum performance for all types (that might or might not be copyable and/or movable) it is best to provide both a reference-to-const and an rvalue-reference signature. If you are writing something less generic yourself that might be considered premature optimization and pass-by-value would suffice. – Chris Drew May 22 '14 at 07:09

Here's what an extremely simplified push_back into a vector might look like when implemented with each of those interfaces:

// by reference
void push_back (value_type const & val)
{
    // Copy val into its designated place.
    new (m_data_ptr + m_len++) value_type (val);
}

// by value
void push_back (value_type val)
{
    // Copy val into its designated place.
    // In C++11, this copy may not happen if value_type is movable. But that's
    // not always the case. (you have to use std::move too.)
    new (m_data_ptr + m_len++) value_type (val);
}

They look the same, don't they?

The problem shows up when you try calling them, especially the pass-by-value version:

string s;
...
v.push_back (s);

If push_back accepts its parameter by reference (i.e. value_type const & val) then a reference to the existing object s is passed into the function and no copies are made at the call. Of course, we still need one copy inside the function, but that one is kinda necessary.

However, if push_back is written to get its parameter by value (i.e. value_type val) then a copy of the s string will be made right at the call site, onto the stack and into the argument that will be named val. Here, val is not a reference to a string, it is a string and it must come from somewhere. This extra copy is what drove the designer(s) of STL and most sensible C++ libraries to adopt pass-by-reference as the preferred choice for many situations. (In case you are wondering, that const is there to tell the caller that, although the function is being handed a reference to its precious object and could in principle modify it, it won't!)
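The difference can be made concrete by counting copies. The following is only a sketch: Counted, TinyVec, push_back_by_ref, push_back_by_value and the two helper functions are hypothetical names loosely modelled on the simplified push_back above, not how any real std::vector is implemented. Counted declares only a copy constructor, so it has no move constructor and behaves like a C++98 class.

```cpp
#include <cstddef>
#include <new>

// Hypothetical element type that counts its copy constructions.
// Declaring the copy constructor suppresses the implicit move constructor.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// Hypothetical fixed-capacity container (room for 4 elements, no bounds checks).
class TinyVec {
    Counted* data_;
    std::size_t len_;
public:
    TinyVec()
        : data_(static_cast<Counted*>(::operator new(4 * sizeof(Counted)))),
          len_(0) {}
    ~TinyVec() {
        while (len_ > 0) data_[--len_].~Counted();  // destroy stored elements
        ::operator delete(data_);                    // release raw storage
    }
    TinyVec(const TinyVec&) = delete;
    TinyVec& operator=(const TinyVec&) = delete;

    // By reference: the only copy is the one into the container's storage.
    void push_back_by_ref(const Counted& val) { new (data_ + len_++) Counted(val); }

    // By value: the caller already copied into val, then val is copied again.
    void push_back_by_value(Counted val) { new (data_ + len_++) Counted(val); }
};

// Number of copy constructions caused by one by-reference push of an lvalue.
int copies_by_ref() {
    TinyVec v;
    Counted c;
    Counted::copies = 0;
    v.push_back_by_ref(c);
    return Counted::copies;
}

// Number of copy constructions caused by one by-value push of an lvalue.
int copies_by_value() {
    TinyVec v;
    Counted c;
    Counted::copies = 0;
    v.push_back_by_value(c);
    return Counted::copies;
}
```

Under these assumptions copies_by_ref() returns 1 and copies_by_value() returns 2: the extra copy is the one made at the call site to create the parameter.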

By the way, this discussion mostly applies to C++98 (i.e. the old C++). Current C++ has rvalue references, move semantics and perfect forwarding, which provide more interface options and the opportunity for cleaner, more precise and more efficient interfaces and implementations, but also make this topic a bit more complicated.

In C++11, there are two overloads of push_back (as well as a new member emplace_back) on vector and other containers.

The push_backs are:

void push_back (value_type const & val);
void push_back (value_type && val);

The second one is the correct version of what you are suggesting: unlike a by-value overload, it is not ambiguous with the reference-to-const overload. It lets the implementation move the value out of that rvalue reference, and overload resolution picks this faster version whenever the argument is an rvalue.

For backward-compatibility reasons (and probably a few other minor ones,) the old push_back signature cannot be removed from C++.
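To see which overload gets picked, one can count copies and moves of an instrumented element type. This is a sketch, not a benchmark: Tracker, Counts and the three count_* helpers are hypothetical names introduced here, and reserve is called first so that no reallocation moves elements around and disturbs the counts.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts its copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    std::string name;
    explicit Tracker(std::string n) : name(std::move(n)) {}
    Tracker(const Tracker& o) : name(o.name) { ++copies; }
    Tracker(Tracker&& o) noexcept : name(std::move(o.name)) { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

struct Counts { int copies; int moves; };

// push_back with an lvalue selects push_back(value_type const&).
Counts count_push_back_lvalue() {
    std::vector<Tracker> v;
    v.reserve(1);
    Tracker t("a");
    Tracker::copies = Tracker::moves = 0;
    v.push_back(t);
    return Counts{Tracker::copies, Tracker::moves};
}

// push_back with an rvalue selects push_back(value_type&&).
Counts count_push_back_rvalue() {
    std::vector<Tracker> v;
    v.reserve(1);
    Tracker t("a");
    Tracker::copies = Tracker::moves = 0;
    v.push_back(std::move(t));
    return Counts{Tracker::copies, Tracker::moves};
}

// emplace_back forwards the constructor argument and builds the element in place.
Counts count_emplace_back() {
    std::vector<Tracker> v;
    v.reserve(1);
    Tracker::copies = Tracker::moves = 0;
    v.emplace_back("a");
    return Counts{Tracker::copies, Tracker::moves};
}
```

With the capacity reserved up front, the lvalue call costs one copy, the rvalue call one move, and emplace_back neither copies nor moves a Tracker.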

yzt

Storing a moveable and copyable class

Imagine you have this class:

class Data {
 public:
  Data() { }
  Data(const Data& data)            { std::cout << "  copy constructor\n";} 
  Data(Data&& data)                 { std::cout << "  move constructor\n";}
  Data& operator=(const Data& data) { std::cout << "  copy assignment\n"; return *this;}
  Data& operator=(Data&& data)      { std::cout << "  move assignment\n"; return *this;}  
};

Note that a conforming C++11 compiler generates all of these functions for you when you declare none of them (older versions of Visual Studio don't generate the move operations), but I'm defining them here for debug output.

Now, if I wanted to write a class to store one of these objects, I might use pass-by-value as you suggest:

class DataStore {
  Data data_;
 public: 
  void setData(Data data) { data_ = std::move(data); }
};

I am taking advantage of C++11 move semantics to move the value to the desired location. I can then use this DataStore like this:

  Data d;   
  DataStore ds;
  
  std::cout << "DataStore test:\n";
  ds.setData(d);
  
  std::cout << "DataStore test with rvalue:\n";
  ds.setData(Data{});
  
  Data d2;
  std::cout << "DataStore test with move:\n";
  ds.setData(std::move(d2));

Which has the following output:

DataStore test:
  copy constructor
  move assignment
DataStore test with rvalue:
  move assignment
DataStore test with move:
  move constructor
  move assignment

Which is fine. I have two moves in the last test, which might not be optimal, but moves are typically cheap so I can live with that. To optimize further we would need to overload the setData function, which we will do later, but that's probably premature optimization at this point.

Storing an unmovable class

But now imagine we have a copyable but unmovable class:

class UnmovableData {
 public:
  UnmovableData() { }
  UnmovableData(const UnmovableData& data) { std::cout << "  copy constructor\n";}
  UnmovableData& operator=(const UnmovableData& data) { std::cout << "  copy assignment\n"; return *this;}  
};

Before C++11, all classes were unmovable, so expect to find lots of them in the wild today. If I needed to write a class to store this one, I couldn't take advantage of move semantics, so I would probably write something like this:

class UnmovableDataStore {
  UnmovableData data_;
 public:
  void setData(const UnmovableData& data) { data_ = data; }
};

and pass by reference-to-const. When I use it:

  std::cout << "UnmovableDataStore test:\n";
  UnmovableData umd;
  UnmovableDataStore umds;
  umds.setData(umd);

I get the output:

UnmovableDataStore test:
  copy assignment

with only one copy as you would expect.
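For comparison, here is a sketch of what the pass-by-value setter would cost for such a type. CopyOnly, ByValueStore, ByConstRefStore and the two helper functions are hypothetical names introduced for this illustration. Because CopyOnly has no move operations, std::move(data) silently falls back to the copy assignment operator.

```cpp
#include <utility>

// Hypothetical copyable-but-unmovable type that counts every copy operation
// (copy constructions and copy assignments together).
struct CopyOnly {
    static int copy_ops;
    CopyOnly() {}
    CopyOnly(const CopyOnly&) { ++copy_ops; }
    CopyOnly& operator=(const CopyOnly&) { ++copy_ops; return *this; }
};
int CopyOnly::copy_ops = 0;

// Pass-by-value setter: std::move cannot move here, so it copies again.
class ByValueStore {
    CopyOnly data_;
public:
    void setData(CopyOnly data) { data_ = std::move(data); }
};

// Pass by reference-to-const: a single copy assignment.
class ByConstRefStore {
    CopyOnly data_;
public:
    void setData(const CopyOnly& data) { data_ = data; }
};

// Copy operations caused by storing one lvalue through the by-value setter.
int copy_ops_by_value() {
    ByValueStore store;
    CopyOnly c;
    CopyOnly::copy_ops = 0;
    store.setData(c);
    return CopyOnly::copy_ops;
}

// Copy operations caused by storing one lvalue through the const-ref setter.
int copy_ops_by_const_ref() {
    ByConstRefStore store;
    CopyOnly c;
    CopyOnly::copy_ops = 0;
    store.setData(c);
    return CopyOnly::copy_ops;
}
```

In this sketch the by-value setter performs two copy operations (one to build the parameter, one for the assignment), while the reference-to-const setter performs one.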

Storing an uncopyable class

You could also have a movable but noncopyable class:

class UncopyableData {
 public:
  UncopyableData() { } 
  UncopyableData(UncopyableData&& data) { std::cout << "  move constructor\n";}
  UncopyableData& operator=(UncopyableData&& data) { std::cout << "  move assignment\n"; return *this;}    
};

std::unique_ptr is an example of a movable but noncopyable class. In this case I would probably write a class to store it like this:

class UncopyableDataStore {
  UncopyableData data_;
 public:
  void setData(UncopyableData&& data) { data_ = std::move(data); }
};

where I pass by rvalue reference and use it like this:

  std::cout << "UncopyableDataStore test:\n";
  UncopyableData ucd;
  UncopyableDataStore ucds;
  ucds.setData(std::move(ucd));

with the following output:

UncopyableDataStore test:
  move assignment

and notice we now only have one move which is good.
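The same pattern applies to std::vector itself when the element type is std::unique_ptr. A minimal sketch follows; push_back_move_only is a hypothetical helper name, and the commented-out line is there only to mark the call that would be rejected.

```cpp
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// std::unique_ptr can be moved but not copied.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr is expected to be move-only");

// Returns true if the pointer was moved into the vector as expected.
bool push_back_move_only() {
    std::vector<std::unique_ptr<int>> v;
    std::unique_ptr<int> p(new int(42));

    // v.push_back(p);          // would not compile: needs a copy of p
    v.push_back(std::move(p));  // selects push_back(value_type&&)

    // A moved-from unique_ptr is guaranteed to be null.
    return p == nullptr && v.size() == 1 && *v.front() == 42;
}
```

A container that offered only the reference-to-const signature could not accept such a type through push_back at all, since that overload has to copy.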

Generic containers

The STL containers, however, need to be generic: they need to work with all types of classes and be as optimal as possible. If you really needed a generic implementation of the data stores above, it might look like this:

template<class D>
class GenericDataStore {
  D data_;
 public:
  void setData(const D& data) { data_ = data; }
  void setData(D&& data) { data_ = std::move(data); }   
};

In this way we get the best possible performance whether the stored class is uncopyable, unmovable, or both copyable and movable, but we need at least two overloads of the setData method, which might introduce duplicate code. Usage:

  std::cout << "GenericDataStore<Data> test:\n";
  Data d3;
  GenericDataStore<Data> gds;
  gds.setData(d3);
  
  std::cout << "GenericDataStore<UnmovableData> test:\n";
  UnmovableData umd2;
  GenericDataStore<UnmovableData> gds3;
  gds3.setData(umd2); 
  
  std::cout << "GenericDataStore<UncopyableData> test:\n";
  UncopyableData ucd2;
  GenericDataStore<UncopyableData> gds2;
  gds2.setData(std::move(ucd2));

Output:

GenericDataStore<Data> test:
  copy assignment
GenericDataStore<UnmovableData> test:
  copy assignment
GenericDataStore<UncopyableData> test:
  move assignment
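One way to avoid writing both overloads is a single member template that takes a forwarding reference and uses std::forward. This is a different technique from the two-overload version above and not what std::vector::push_back does; ForwardingDataStore, Probe and the helper functions are hypothetical names for this sketch.

```cpp
#include <utility>

// Hypothetical type that counts copy and move assignments.
struct Probe {
    static int copy_assignments;
    static int move_assignments;
    Probe() {}
    Probe(const Probe&) {}
    Probe(Probe&&) {}
    Probe& operator=(const Probe&) { ++copy_assignments; return *this; }
    Probe& operator=(Probe&&) { ++move_assignments; return *this; }
};
int Probe::copy_assignments = 0;
int Probe::move_assignments = 0;

// One setter instead of two: U&& is a forwarding reference, and
// std::forward preserves whether the argument was an lvalue or an rvalue.
template <class D>
class ForwardingDataStore {
    D data_;
public:
    template <class U>
    void setData(U&& data) { data_ = std::forward<U>(data); }
};

// Storing an lvalue should use exactly one copy assignment.
bool lvalue_is_copied() {
    ForwardingDataStore<Probe> store;
    Probe p;
    Probe::copy_assignments = Probe::move_assignments = 0;
    store.setData(p);
    return Probe::copy_assignments == 1 && Probe::move_assignments == 0;
}

// Storing an rvalue should use exactly one move assignment.
bool rvalue_is_moved() {
    ForwardingDataStore<Probe> store;
    Probe p;
    Probe::copy_assignments = Probe::move_assignments = 0;
    store.setData(std::move(p));
    return Probe::copy_assignments == 0 && Probe::move_assignments == 1;
}
```

The trade-off is that this unconstrained template accepts any argument type that happens to be assignable to D, so error messages and overload behaviour can be less clear than with the two explicit overloads.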

Live demo. Hope that helps.

Chris Drew

A few reasons:

  • One copy is already being made into the container. If the parameter were not const value_type& val, passing the argument would force another copy. Passing it by reference (const value_type&) avoids that.
  • It also tells the compiler and the caller that val cannot be modified through this parameter. This is done by making it `const`.
  • Of course, once you make a copy you are allowed to modify that copy, but val itself cannot be modified, and the function declaration guarantees that.