3

From reading this post, it is clear that placement new in C++ is used to call a class constructor on a pre-allocated memory location.

In the case that the memory is already initialized, is a placement new or a reinterpret_cast more appropriate?

For example, let's say I read a raw stream of bytes representing a framed message from a TCP socket. I put this stream into a framesync and retrieve a buffer of a known size that represents my class, which I'll call Message. I know of two ways to proceed.

  1. Create a constructor that takes a flag telling the class not to initialize. Do a placement new on the buffer passing the "don't initialize" flag.

    Message::Message( bool initialize ) 
    {
        //
        // Initialize if requested
        //
        if( initialize )
        {
            Reset( );
        }
    }
    
    void Message::Reset( void )
    {
       m_member1 = 1;
       m_member2 = 2;
    }
    
    Message* message = new ( buffer ) Message( false );
    
  2. Use a reinterpret_cast

    Message* message = reinterpret_cast< Message* > ( buffer ); 
    

I believe that both of these will produce an identical result. Is one preferred over the other as more correct, more OO, safer, easier to read, or better style?

Hugh White
  • 7
    I find it weird to juxtapose those two operations, as they are completely and fundamentally unrelated. For your particular problem, the first “solution” either won’t work, or will only work with quite considerable effort. – Konrad Rudolph Jul 24 '12 at 16:25
  • Konrad Rudolph - I take it your comment means that #1 is a hack, and you would prefer #2? Cat Plus Plus - is there another way to take untyped, raw data from hardware and turn it into a useable form? – Hugh White Jul 24 '12 at 16:32
  • I'd start with something like: `Message::Message(void *raw_data)`. That might A) copy the data, or B) maintain a pointer to the raw data. I'd avoid the latter though. – Jerry Coffin Jul 24 '12 at 16:33
  • @user1210290 No, #2 is a hack – but sometimes (albeit **very** rarely) justified. #1 plain won’t work. – Konrad Rudolph Jul 24 '12 at 16:34
  • @KonradRudolph, Not so rarely when the windows api becomes involved. That code is littered with mandatory reinterpret casts. They had their reasons way back when it was written, though, I guess. – chris Jul 24 '12 at 16:39
  • @KonradRudolph, my code for #1 may have been unclear. Message::Message( bool initialize ) { // // Initialize if requested // if( initialize ) { Reset( ); } } void Message::Reset( void ) { m_member1 = 1; m_member2 = 2; } Why will this not work? What is the non-hacker way to accomplish #2? – Hugh White Jul 24 '12 at 16:40
  • @user1210290 there is the obvious way: http://ideone.com/bkb7o – R. Martinho Fernandes Jul 24 '12 at 16:41
  • 1
    @chris This *is* pretty rarely though. You essentially only need it for platform interop or when communicating via a byte protocol (but even there it can be avoided, possibly at the cost of performance though). – Konrad Rudolph Jul 24 '12 at 16:42
  • 3
    @user1210290 Read up about how constructors work. Class members will be default-initialised, overwriting the raw memory. You cannot prevent this from within the class (because then it’s already too late) so the `initialize` argument is useless. – Konrad Rudolph Jul 24 '12 at 16:43
  • @KonradRudolph, assuming the class contains only basic types (uint8_t etc), I don't think these values will be initialized. Am I missing something? – Hugh White Jul 24 '12 at 16:51
  • 3
    @user1210290: In that case, they won't be initialised, but there's no requirement to preserve the old values either. A debug build might overwrite the memory before calling the constructor. – Mike Seymour Jul 24 '12 at 16:55

2 Answers

13

The only meaningful rule is this:

If an instance of some type T has already been constructed at address a, then use reinterpret_cast<T*>(a) to get a pointer to the object that already exists.

If an instance of some type T has not yet been constructed at address a, then use placement new to construct an instance of type T at address a.

They are completely different operations.

The question you need to ask is very, very simple: "does the object already exist?" If yes, you can access it (via a cast). If no, then you need to construct it (via placement new).

The two operations have nothing to do with each other.

It's not a question of which one you should prefer, because they do different things. You should prefer the one which does what you want.
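
As a minimal sketch of the rule above: the `Message` layout and the helper names `construct_at` and `existing_at` are made up for illustration and are not from the question. The `std::launder` call is an assumption about strict C++17 conformance; it is skipped on older standards.

```cpp
#include <cassert>
#include <new>  // placement new, std::launder (C++17)

// Hypothetical stand-in for the Message class in the question.
struct Message {
    int m_member1;
    int m_member2;
    Message() : m_member1(1), m_member2(2) {}
};

// Case 1: no Message exists at `storage` yet, so construct one there.
// The caller must supply storage that is large enough and suitably aligned.
inline Message* construct_at(void* storage) {
    assert(storage != nullptr);
    return new (storage) Message();
}

// Case 2: a Message has already been constructed at `storage`,
// so a cast is enough to reach the existing object.
inline Message* existing_at(void* storage) {
#if __cplusplus >= 201703L
    return std::launder(reinterpret_cast<Message*>(storage));
#else
    return reinterpret_cast<Message*>(storage);
#endif
}
```

Calling `existing_at` on storage where `construct_at` has not run yet would not be covered by the rule: there is no object to point at, only bytes.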

jalf
  • Thanks, that's a great way to think about it. Do you know of any source for this rule? I ask because I see #1 regularly at my company, but have been criticized for #2. – Hugh White Jul 24 '12 at 17:07
  • 2
    @user1210290: The source of these rules is the language definition. #1 creates a new object, potentially overwriting existing data, which is not what you want to do at all; #2 reinterprets existing data as a different type, which is what you want - as long as it's a standard-layout type whose layout matches the data. – Mike Seymour Jul 24 '12 at 17:11
1

I would say neither.

Using placement new and having a special method of construction seems like a hack. For one thing the standard says that, for example, an int class member that's not initialized has 'indeterminate value' and accessing it 'may' result in undefined behavior. It's not specified that the int will assume the value of the unmodified underlying bytes interpreted as an int. I don't think that there's anything that prevents a conforming implementation from zero initializing the memory before calling the constructor.

For this use of reinterpret_cast to be well defined you have to jump through some hoops, and even then using the resulting object will probably violate strict aliasing rules.
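
To give a rough idea of those hoops, here is a sketch of the checks one might write before trusting such a cast, alongside a copy-based alternative that sidesteps aliasing and alignment. The `Message` layout and the helper names are assumptions for illustration only, and either route still relies on both ends agreeing on layout and byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical message layout, assumed for this sketch.
struct Message {
    std::uint32_t m_member1;
    std::uint16_t m_member2;
};

// Compile-time preconditions for treating raw bytes as a Message.
static_assert(std::is_standard_layout<Message>::value,
              "Message must be standard-layout");
static_assert(std::is_trivially_copyable<Message>::value,
              "Message must be trivially copyable");

// Run-time precondition for a cast: the buffer must be aligned for Message.
inline bool is_aligned_for_message(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(Message) == 0;
}

// Alternative to the cast: copy the bytes into a real Message object.
// `bytes` must point to at least sizeof(Message) readable bytes.
inline Message copy_from_bytes(const unsigned char* bytes) {
    assert(bytes != nullptr);
    Message m;
    std::memcpy(&m, bytes, sizeof m);
    return m;
}
```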

More practically, if you directly send the implementation-specified representation of a class across the network you'll be relying on the communicating systems having compatible layouts (compatible representations, alignment, etc.).

Instead you should do real serialization and deserialization, for example by using memcpy() and ntohl()/ntohs() to get the data from the buffer into the members of an existing object.

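    #include <arpa/inet.h>  // ntohl, ntohs (POSIX)
    #include <cstdint>
    #include <cstring>      // memcpy

    struct Message {
        uint32_t m_member1;
        uint16_t m_member2;
    };

    // Copy each field out of the byte stream, then convert it from
    // network byte order to host byte order.
    Message Deserialize(const char *buffer)
    {
        Message m;

        memcpy(&m.m_member1, buffer, sizeof m.m_member1);
        m.m_member1 = ntohl(m.m_member1);
        buffer += sizeof m.m_member1;

        memcpy(&m.m_member2, buffer, sizeof m.m_member2);
        m.m_member2 = ntohs(m.m_member2);
        buffer += sizeof m.m_member2;

        return m;
    }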

If you don't just use a preexisting library you'll probably want to wrap this stuff up in a framework of your own.

This way you don't have to deal with alignment issues, the network representation is well defined and can be passed between differing implementations, and the program doesn't use technically undefined behavior.
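
One possible shape for such a home-grown wrapper is sketched below. `ByteReader` and this `Deserialize` signature are hypothetical names, and the wire format (big-endian, a 32-bit field followed by a 16-bit field) is an assumption carried over from the snippet above. Assembling the values from individual bytes avoids both the platform-specific `ntohl()` header and any dependence on host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Message {
    std::uint32_t m_member1;
    std::uint16_t m_member2;
};

// Hypothetical helper: a bounds-checked cursor over big-endian wire data.
class ByteReader {
public:
    ByteReader(const unsigned char* data, std::size_t size)
        : m_data(data), m_size(size), m_pos(0), m_ok(true) {
        assert(data != nullptr || size == 0);
    }

    std::uint16_t ReadU16() {
        if (!Require(2)) return 0;
        const std::uint16_t v = static_cast<std::uint16_t>(
            (m_data[m_pos] << 8) | m_data[m_pos + 1]);
        m_pos += 2;
        return v;
    }

    std::uint32_t ReadU32() {
        if (!Require(4)) return 0;
        const std::uint32_t v =
            (static_cast<std::uint32_t>(m_data[m_pos]) << 24) |
            (static_cast<std::uint32_t>(m_data[m_pos + 1]) << 16) |
            (static_cast<std::uint32_t>(m_data[m_pos + 2]) << 8) |
            static_cast<std::uint32_t>(m_data[m_pos + 3]);
        m_pos += 4;
        return v;
    }

    // False once any read has run past the end of the buffer.
    bool ok() const { return m_ok; }

private:
    bool Require(std::size_t n) {
        if (!m_ok || m_size - m_pos < n) {
            m_ok = false;
            return false;
        }
        return true;
    }

    const unsigned char* m_data;
    std::size_t m_size;
    std::size_t m_pos;
    bool m_ok;
};

// Fills `out` field by field; returns false if the buffer was too short.
inline bool Deserialize(const unsigned char* data, std::size_t size, Message& out) {
    ByteReader reader(data, size);
    out.m_member1 = reader.ReadU32();
    out.m_member2 = reader.ReadU16();
    return reader.ok();
}
```

The length check is the main thing a wrapper adds over the raw memcpy() sequence: a truncated frame is reported to the caller instead of being read past its end.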

bames53