
I'm aware that, unless otherwise specified, all standard library functions that accept rvalue reference parameters are guaranteed to leave the moved-from argument in a valid but unspecified state, and that some of the examples here may exhibit undefined behavior as a result, but the underlying question does not depend on that.

The following program:

//  testmove1.cpp

#include <iostream>
#include <string>
#include <utility>

int main() {
    std::string s1{"String"};
    std::cout << "s1: [" << s1 << "]" << std::endl;
    std::string s2{std::move(s1)};
    std::cout << "s2: [" << s2 << "]" << std::endl;
    std::cout << "s1 after move: [" << s1 << "]" << std::endl;  //  Undefined

    return 0;
}

outputs:

paul@local:~$ ./testmove1
s1: [String]
s2: [String]
s1 after move: []
paul@local:~$ 

Outputting s1 after the move seems undefined to me, but the string being left empty is at least a feasible alternative. Valgrind reports a single allocation being made for this program, which is what you'd expect.

If I do something very similar, but with a class member, I get different results:

//  testmove2.cpp

#include <iostream>
#include <string>
#include <utility>

class MyClass {
    std::string m_data;

    public:
        MyClass(const std::string& data) :
            m_data(data) {}
        MyClass(const MyClass&& other) :
            m_data{std::move(other.m_data)} {};
        const std::string& get_data() const { return m_data; }
};

int main() {
    MyClass c1{"Object"};
    std::cout << "c1: [" << c1.get_data() << "]" << std::endl;
    MyClass c2{std::move(c1)};
    std::cout << "c2: [" << c2.get_data() << "]" << std::endl;
    std::cout << "c1 after move: [" << c1.get_data() << "]" << std::endl;

    return 0;
}

with output:

paul@local:~$ ./testmove2
c1: [Object]
c2: [Object]
c1 after move: [Object]
paul@local:~$ 

Not having the second string cleared out also seems like a feasible alternative, so this is not in itself surprising. It is surprising to me that this behavior is different purely as a result of putting the string in a class, however. Valgrind also reports a single allocation being made here.

To test whether they're actually pointing to the same thing, I can change c2 after the move, and check whether c1 is also getting changed:

//  testmove3.cpp

#include <iostream>
#include <string>
#include <utility>

class MyClass {
    std::string m_data;

    public:
        MyClass(const std::string& data) :
            m_data(data) {}
        MyClass(const MyClass&& other) :
            m_data{std::move(other.m_data)} {};
        const std::string& get_data() const { return m_data; }
        void change_data() { m_data[0] = 'A'; }
};

int main() {
    MyClass c1{"Object"};
    std::cout << "c1: [" << c1.get_data() << "]" << std::endl;
    MyClass c2{std::move(c1)};
    std::cout << "c2: [" << c2.get_data() << "]" << std::endl;
    std::cout << "c1 after move: [" << c1.get_data() << "]" << std::endl;

    c2.change_data();
    std::cout << "c1 after change: [" << c1.get_data() << "]" << std::endl;
    std::cout << "c2 after change: [" << c2.get_data() << "]" << std::endl;

    return 0;
}

which outputs:

paul@local:~$ ./testmove3
c1: [Object]
c2: [Object]
c1 after move: [Object]
c1 after change: [Object]
c2 after change: [Abject]
paul@local:~$ 

Here, the two objects are clearly not pointing to the same thing, as changing c2 does not affect what's stored in c1. Valgrind now reports 2 allocations, which seems obviously necessary to explain the observed behavior as we clearly have two different strings, but it's not obvious to me why we suddenly get 2 allocations purely as a result of changing one of them. If I get rid of the move altogether, and just create c1 and then call change_data() on it, I only get 1 allocation as you'd expect.

We can get rid of the undefined behavior (barring any other errors on my part) by removing all accesses to c1 after the move:

// testmove4.cpp

#include <iostream>
#include <string>
#include <utility>

class MyClass {
    std::string m_data;

    public:
        MyClass(const std::string& data) :
            m_data(data) {}
        MyClass(const MyClass&& other) :
            m_data{std::move(other.m_data)} {};
        const std::string& get_data() const { return m_data; }
        void change_data() { m_data[0] = 'A'; }
};

int main() {
    MyClass c1{"Object"};
    std::cout << "c1: [" << c1.get_data() << "]" << std::endl;
    MyClass c2{std::move(c1)};
    std::cout << "c2: [" << c2.get_data() << "]" << std::endl;

    c2.change_data();
    std::cout << "c2 after change: [" << c2.get_data() << "]" << std::endl;

    return 0;
}

which outputs:

paul@local:~$ ./testmove4
c1: [Object]
c2: [Object]
c2 after change: [Abject]
paul@local:~$ 

We can obviously no longer see the fact that c1 is not changing, since we're not outputting it. But Valgrind still shows 2 allocations.

Does anyone know what's going on here?

  1. Why does a std::string appear to be zeroed out following a move when it's by itself, but not when it's a class member?

  2. In the final example, when I move an object and then change it, why am I getting two allocations rather than one, when I only get one allocation if I move the object and then don't change it? I know we seem to be heading towards quantum computing, but having the Uncertainty Principle at play in C++ seems a little premature.

I'm using g++ 4.7.2, but I get the same observed behavior on clang-503.0.40.

EDIT: Thinking it over, if you end up with two objects in valid states, it does make sense for them both to have allocations. Is this just the compiler optimizing away one of those allocations when it can ascertain that one of them will never be used? An annoying hazard of constructing a minimal example, if it is.

Crowman
  • If something is in a valid state, then it is not undefined behaviour to check it. Maybe you mean *unspecified*? – M.M Jun 16 '14 at 00:14
  • @MattMcNabb: You're probably right, simply sending it to `std::cout` probably doesn't involve any functions with preconditions. I did mean undefined, since something like `m_data[0]` would qualify after the move, but what's happening here may not fall into that category. Not really germane to the question in any case. – Crowman Jun 16 '14 at 00:18

2 Answers


I think this is due to:

MyClass(const MyClass&& other) :
        ^^^^^

Since the object bound to other cannot be changed via this reference, the intended move operation ends up being just a copy. If I delete this const then the behaviour changes back to what you expected:

$ g++ -o tm3 tm3.cc -std=c++11 && ./tm3
c1: [Object]
c2: [Object]
c1 after move: []
M.M
  • Aha. That's a plausible and in-hindsight-obvious answer to the first question. – Crowman Jun 16 '14 at 00:29
  • I'm not sure if the Standard actually mandates this behaviour (in either the const or non-const case), hopefully someone else can jump in with some standard quotes. – M.M Jun 16 '14 at 00:31

ad 1. I suspect that MyClass c2{std::move(c1)}; calls the default copy constructor, as std::move(c1) results in an rvalue reference and not a const rvalue reference. The default copy constructor for MyClass then calls the copy constructor for std::string.

If, however, MyClass(const MyClass&& other) were called, it still wouldn't call the move constructor for std::string, as std::move(other.m_data) has type const std::string&&.

The matching rules for the different kinds of references aren't easy to memorize and apply, but a typical pattern for implementing resource transfer is to have a constructor that takes a non-const rvalue reference as its argument, since the resource is intended to be taken away from it.

ad 2. std::string is usually implemented with reference counting and copy-on-write (the buffer is only copied on a write while more than one reference to it exists). This would explain your observations.

  • copy-on-write `std::string` is [not permitted](http://stackoverflow.com/questions/12199710/legality-of-cow-stdstring-implementation-in-c11) in C++11 (although perhaps non-conforming implementations are still out there) – M.M Jun 16 '14 at 00:31