Trying to write string class that can do move semantics from an std::string

Question

I am writing my own string class, really just for learning and to cement some knowledge. I have everything working, except that I want a constructor that uses move semantics with a std::string.

Within my constructor I need to copy the std::string's data pointer (and its other members) and then null them out. The std::string must be left in an empty but valid state, without deleting the data it points to. How do I do this?

So far I have this:

#include <string>

class String
{
private:
    char* mpData;
    unsigned int mLength;
public:
    String( std::string&& str)
        : mpData(nullptr), mLength(0)
    {
        // need to copy the memory pointer from std::string to this->mpData

        // need to null out the std::string memory pointer
        //str.clear();  // can't use clear because it deletes the memory
    }
    ~String()
    {
        delete[] mpData;
        mLength = 0;
    }
};

Related question: http://stackoverflow.com/questions/9957840/move-string-into-vector — Brent Bradburn, Sep 22 '12 at 22:54

score 7 · Accepted Answer · answered Jul 28 '12 at 04:19


There is no way to do this. The internal layout of std::string is left to the implementation, every implementation is different, and the public interface offers no way to take ownership of the buffer.

Further, there is no guarantee that the string will be contained in a dynamically allocated array. Some std::string implementations perform a small string optimization, where small strings are stored inside of the std::string object itself.
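
To make the small string optimization concrete, here is a minimal sketch that checks whether a string's characters live inside the std::string object itself. The helper name `stored_inline` is made up for this illustration, and the result depends entirely on the standard library in use.

```cpp
#include <functional>
#include <string>

// Hypothetical helper (not part of the standard library): returns true if the
// character buffer of `s` lies within the bytes of the std::string object,
// which is what happens when the small string optimization is in effect.
bool stored_inline(const std::string& s)
{
    const char* object_begin = reinterpret_cast<const char*>(&s);
    const char* object_end = object_begin + sizeof(std::string);
    const char* buffer = s.data();

    // std::less provides a well-defined ordering even for unrelated pointers.
    std::less<const char*> before;
    return !before(buffer, object_begin) && before(buffer, object_end);
}
```

On an implementation that uses the optimization, a short string such as `std::string("hi")` would typically report true, while a string longer than the object itself cannot. In the inline case there is no heap block to steal at all.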


James McNellis


So it's impossible? Is it generally a bad idea to implement move semantics between classes of different types? How does the STL manage this, for example within a vector of strings? – EddieV223 Jul 28 '12 at 04:23

A `vector` only needs to be able to move-construct or move-assign a `string` from another `string`. The `string` class handles this via its move constructor and move assignment operator. Cross-type moving is a bit odd, but it's definitely possible if you control the implementations of both the source and target types or if the source type provides some way to move from it (e.g. `unique_ptr` provides a `release` member function). – James McNellis Jul 28 '12 at 04:25
I really hate the summary conclusion "no way." C++ gives you a lot of control and a stateful allocator can at least manage it for the general `std::basic_string` case. It only needs to remember the block allocated (making the fair assumption that one `basic_string` does not own multiple blocks on the heap), and the custom `move`-from only needs to notify it of the ownership change before `clear`ing the `basic_string`. – Potatoswatter Jul 28 '12 at 16:25

@Potatoswatter: And now you can't move from arbitrary strings anymore. – Xeo Jul 28 '12 at 17:35
@Xeo It's not much of a practical solution (well, it's as bad as any custom allocator but no worse), but it's more constructive to mention what you can do than to leave it at "nope." – Potatoswatter Jul 29 '12 at 02:05
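
As a rough illustration of the point made in the comments about `unique_ptr::release`, the sketch below shows a cross-type move that does work, because the source type offers a way to give up ownership. `OwnedString` is a hypothetical class written for this example, not the String from the question, and passing the length separately is an assumption of the sketch.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical class for illustration only.
class OwnedString
{
private:
    char* mpData;
    std::size_t mLength;
public:
    // Takes over the array owned by `buffer`. release() hands back the raw
    // pointer and leaves `buffer` holding nullptr, so nothing is freed here.
    OwnedString(std::unique_ptr<char[]>&& buffer, std::size_t length)
        : mpData(buffer.release()), mLength(length)
    {}

    ~OwnedString() { delete[] mpData; }

    // Copying is disabled so that two objects never delete the same array.
    OwnedString(const OwnedString&) = delete;
    OwnedString& operator=(const OwnedString&) = delete;

    const char* data() const { return mpData; }
    std::size_t length() const { return mLength; }
};
```

std::string has no member comparable to `release`, which is why the same trick cannot be applied to it.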

score 1 · Answer 2 · edited May 23 '17 at 11:47

The implementation below accomplishes what was requested, but at some risk.

Notes about this approach:

It uses std::string to manage the allocated memory. In my view, layering the allocation like this is a good idea because it reduces the number of things that a single class is trying to accomplish (but due to the use of a pointer, this class still has potential bugs associated with compiler-generated copy operations).
I did away with the delete operation since that is now performed automatically by the allocation object.
It will invoke so-called undefined behavior if mpData is used to modify the underlying data. It is undefined, as indicated here, because the standard says it is undefined. I wonder, though, if there are real-world implementations for which const char * std::string::data() behaves differently than T * std::vector::data() -- through which such modifications would be perfectly legal. It may be possible that modifications via data() would not be reflected in subsequent accesses to allocation, but based on the discussion in this question, it seems very unlikely that such modifications would result in unpredictable behavior assuming that no further changes are made via the allocation object.
Is it truly optimized for move semantics? That may be implementation defined. It may also depend on the actual value of the incoming string. As I noted in my other answer, the move constructor provides a mechanism for optimization -- but it doesn't guarantee that an optimization will occur.

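#include <string>
#include <utility>

class String
{
private:
    std::string allocation; // declared first so that it is initialized first
    char* mpData;
    unsigned int mLength;
public:
    String( std::string&& str)
        : allocation(std::move(str)) // this is where the magic happens
        // take the pointer only after the move: with the small string
        // optimization the characters may now live inside allocation itself
        , mpData(const_cast<char*>(allocation.data())) // writing through this is the UB noted above
        , mLength(static_cast<unsigned int>(allocation.length()))
    {}
};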
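
Whether the move actually transfers the buffer can be probed on a given implementation. The following is a small sketch, not a portable guarantee: `move_reused_buffer` is a hypothetical helper that compares the buffer address before and after a std::string move construction.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: moves `source` into a new string and reports whether
// the new string ended up with the same buffer address the source had.
bool move_reused_buffer(std::string source)
{
    const char* before = source.data();
    std::string target(std::move(source));
    return target.data() == before;
}
```

For a long, heap-allocated string the common implementations are expected to hand the buffer over, so the addresses match. For a short string held inline, the characters have to be copied into the new object and the addresses will differ, which is the "depends on the actual value of the incoming string" caveat from the notes.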

score -1 · Answer 3 · edited May 23 '17 at 10:34

I am interpreting the question as "can I make the move constructor result in correct behavior" and not "can I make the move constructor optimally fast".

If the question is strictly, "is there a portable way to steal the internal memory from std::string", then the answer is "no, because there is no 'transfer memory ownership' operation provided in the public API".

The following quote from this explanation of move semantics provides a good summary of "move constructors"...

C++0x introduces a new mechanism called "rvalue reference" which, among other things, allows us to detect rvalue arguments via function overloading. All we have to do is write a constructor with an rvalue reference parameter. Inside that constructor we can do anything we want with the source, as long as we leave it in some valid state.

Based on this description, it seems to me that you can implement the "move semantics" constructor (or "move constructor") without being obligated to actually steal the internal data buffers. An example implementation:

String( std::string&& str)
    :mpData(new char[str.length()]), mLength(str.length())
    {
    // copy the characters; str is left unchanged, which is still a valid state
    for ( unsigned int i=0; i<mLength; i++ ) mpData[i] = str[i];
    }

As I understand it, the point of move semantics is that you can be more efficient if you want to. Since the incoming object is transient, its contents do not need to be preserved -- so it is legal to steal them, but it is not mandatory. Maybe, there is no point to implementing this if you aren't transferring ownership of some heap-based object, but it seems like it should be legal. Perhaps it is useful as a stepping stone -- you can steal as much as is useful, even if that isn't the entire contents.

By the way, there is a closely related question here in which the same kind of non-standard string is being built and includes a move constructor for std::string. The internals of the class are different however, and it is suggested that std::string may have built-in support for move semantics internally (std::string -> std::string).

If you're not moving dynamic memory then just use a reference. Also, your code may need a null terminator, depending on the implementation of the class. String( std::string& str ) – EddieV223, Jul 28 '12 at 05:21
He's trying to do *more* than move semantics allows, not less. — Benjamin Lindley, Jul 28 '12 at 05:44
I found this blog post that agrees with my answer: http://akrzemi1.wordpress.com/2011/08/30/move-constructor-qa — Brent Bradburn, Jul 28 '12 at 15:32
From your blog post: If you just implement a regular copying in your move constructor you do satisfy the two constraints, so it is correct to do so; however, there is really no point in doing so. If you cannot or do not want to offer move constructor that works more efficient than the copy constructor just don’t define it at all. The copy constructor will do in all cases. — EddieV223, Jul 28 '12 at 21:56
@EddieV223: Agreed -- I noted that in my answer. But since this is "just for learning", I thought I would make note of some nuances with my answer. — Brent Bradburn, Sep 22 '12 at 23:27
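
The point quoted from the blog post, that a copying constructor "will do in all cases", can be sketched as follows. `CopyingString` is a hypothetical class made up for this example. It assumes the class wants a null terminator and it always copies the characters.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical class for illustration only.
class CopyingString
{
private:
    char* mpData;
    std::size_t mLength;
public:
    // A const lvalue reference binds to lvalues, temporaries, and the result
    // of std::move alike, so no std::string&& overload is required.
    CopyingString(const std::string& str)
        : mpData(new char[str.length() + 1]), mLength(str.length())
    {
        // c_str() is null-terminated, so copying length() + 1 bytes
        // brings the terminating '\0' along.
        std::memcpy(mpData, str.c_str(), mLength + 1);
    }

    ~CopyingString() { delete[] mpData; }

    // Copying is disabled to keep the sketch short and free of double deletes.
    CopyingString(const CopyingString&) = delete;
    CopyingString& operator=(const CopyingString&) = delete;

    const char* c_str() const { return mpData; }
    std::size_t length() const { return mLength; }
};
```

Because the parameter is a const reference, passing `std::move(s)` does not modify `s`. The source is simply copied, which is correct, just not faster than an ordinary copy.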
