I want to find a way to encapsulate a header-only third-party library without exposing its header files. In our other projects, we encapsulate by using void*: in the implementation, we allocate memory, store its address in the void*, and cast it back to a pointer of the original type whenever we use it. But this time the encapsulated class is used frequently, so dynamic allocation is unacceptable. Here is another solution I'm currently considering.

Assuming the encapsulated class needs N bytes, I will give the wrapper class a char array member of size N, named data, for instance. In the implementation, whenever I assign an object of the encapsulated class to the wrapper or forward a function call, I first cast &data to a pointer to the encapsulated class with reinterpret_cast. The char array is purely a placeholder. To make this clear, here is some sample code.

#include <iostream>

struct Inner {
    void print() const {
        std::cout << "Inner::print()\n";
    }
};

struct Wrapper;

Inner* addressof(Wrapper&);
const Inner* addressof(const Wrapper&);

struct Wrapper {
    Wrapper() {
        Inner* ptr = addressof(*this);
        *ptr = Inner();
    }

    void run() const {
        addressof(*this)->print();
    }
    
    char data[1];

};

Inner* addressof(Wrapper& w) {
    return reinterpret_cast<Inner*>(&(w.data));
}

const Inner* addressof(const Wrapper& w) {
    return reinterpret_cast<const Inner*>(&(w.data));
}

int main() {
    Wrapper wrp;
    wrp.run();
}

From the point of view of memory, this seems to make sense. But I'm not sure whether this is some kind of undefined behaviour.

Additionally, I want to know if there is a list of undefined behaviours. cppreference doesn't seem to contain such a thing, and the C++ standard specification is really hard to understand.

  • This might be made to work, but something you'll need to be careful of is to ensure your wrapper type's alignment is at least that of the wrapped type. By "made to work" I mean even if it's formally undefined it may still work. Many undefined things can be made to work in particular circumstances, you just need to be careful of what those circumstances are. – SoronelHaetir Nov 22 '22 at 05:58
  • I believe this is indeed [UB](https://en.cppreference.com/w/cpp/language/ub). There are *very few* things you can safely do with [reinterpret_cast](https://en.cppreference.com/w/cpp/language/reinterpret_cast) and I don't see how this qualifies. – Jesper Juhl Nov 22 '22 at 06:10
  • @JesperJuhl I noticed this [question](https://stackoverflow.com/questions/573294/when-to-use-reinterpret-cast). I think the circumstance of the second answer seems similar to mine. Both cast one pointer type to another. Could you tell me the difference? – SynchronizX Nov 22 '22 at 06:25
  • @SynchronizX I agree that looks like your situation, but I just don't see how that falls within the allowed uses of `reinterpret_cast` still. But I may be wrong. – Jesper Juhl Nov 22 '22 at 06:59
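
The alignment concern raised in the first comment can be checked at compile time. The following is only a sketch under stated assumptions: `kInnerSize` and `kInnerAlign` are hypothetical constants that the public header would hard-code, `Inner` here is a stand-in for the third-party type, and the `static_assert` lines would live in the implementation file, where the real header is visible. It does not by itself make the `reinterpret_cast` in the question well-defined; it only guards the size and alignment of the placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the third-party type. In a real project it would only be
// visible inside the implementation (.cpp) file.
struct Inner {
    int value;
};

// Hypothetical constants hard-coded in the public header, which cannot
// see Inner. They may be larger than strictly necessary.
constexpr std::size_t kInnerSize = 8;
constexpr std::size_t kInnerAlign = 8;

struct Wrapper {
    // alignas makes the placeholder at least as strictly aligned as requested.
    alignas(kInnerAlign) unsigned char data[kInnerSize];
};

// These checks belong in the implementation file. They fail to compile
// if the library type ever outgrows the placeholder.
static_assert(sizeof(Inner) <= kInnerSize, "placeholder is too small for Inner");
static_assert(alignof(Inner) <= kInnerAlign, "placeholder is under-aligned for Inner");

// Runtime view of the same property: the buffer address is a multiple of
// Inner's alignment.
bool is_aligned_for_inner(const Wrapper& w) {
    return reinterpret_cast<std::uintptr_t>(&w.data[0]) % alignof(Inner) == 0;
}

// Small usage example.
void demo_alignment_check() {
    Wrapper w;
    assert(is_aligned_for_inner(w));
}
```

If the library is upgraded and `Inner` grows, the build breaks at the `static_assert` instead of silently overrunning the buffer.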

1 Answer

What you have here is undefined behavior. The reason is that when you reinterpret an object as a different type, you are not allowed to modify it through that type until you cast it back to the original type.


In your code, data is originally a char[1]. Later, in your constructor, you reinterpret_cast &data to Inner*. At this point, modifying its value produces undefined behavior.

What you could do, however, is first create an Inner object, then cast it and store it in the char[1]. Later you can cast the char[1] back to the Inner object and do anything you want with it.

So now your constructor would look like this:

// std::memcpy is declared in <cstring>
Wrapper() {
    Inner inner;
    char* ptr = reinterpret_cast<char*>(&inner);
    std::memcpy(data, ptr, 1);
}

However, if you do it like this, you don't even need the reinterpret_cast, as you can memcpy directly from inner:

Wrapper() {
    Inner inner;
    std::memcpy(data, &inner, 1);
}
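
For reference, a complete round trip with `memcpy` might look like the sketch below. It assumes `Inner` is trivially copyable (checked with a `static_assert`), and the `value` member, the `doubled()` function, and `demo_round_trip()` are made up for illustration. Instead of casting the buffer back, `run()` copies the bytes into a fresh `Inner`, which avoids touching the buffer through an `Inner*` at all.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Illustrative stand-in for the wrapped type.
struct Inner {
    int value;
    int doubled() const { return value * 2; }
};

// Copying the object representation is only meaningful for trivially
// copyable types.
static_assert(std::is_trivially_copyable<Inner>::value,
              "memcpy round trip requires a trivially copyable type");

struct Wrapper {
    explicit Wrapper(int v) {
        Inner inner{v};
        // Copy the bytes of a real Inner object into the placeholder.
        std::memcpy(data, &inner, sizeof(Inner));
    }

    int run() const {
        // Rebuild an Inner from the stored bytes, then use it normally.
        Inner inner;
        std::memcpy(&inner, data, sizeof(Inner));
        return inner.doubled();
    }

    unsigned char data[sizeof(Inner)];
};

// Small usage example.
void demo_round_trip() {
    Wrapper w(21);
    assert(w.run() == 42);
}
```

Compilers typically optimize such small fixed-size `memcpy` calls away, so the extra copy in `run()` is usually not a runtime cost.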

Better, if you have C++20, you can and should use std::bit_cast, along with std::byte (C++17) and std::array (C++11):

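#include <array>    // std::array
#include <bit>      // std::bit_cast (C++20)
#include <cstddef>  // std::byte

struct Wrapper {
    Wrapper()
    : data(std::bit_cast<decltype(data)>(Inner{}))
    {}

    void run() const {
        // bit_cast returns a new Inner built from the stored bytes
        std::bit_cast<Inner>(data).print();
    }
    
    std::array<std::byte, 1> data;
};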

Demo: https://godbolt.org/z/MaT5sasaT

Ranoiaetep
  • "What you have here is undefined behavior. The reason is when you reinterpret an object to a different type, you are not allowed to modify it until you cast it back to the original type." - does that mean I can always reinterpret cast, as long as I don't modify? – Raildex Nov 22 '22 at 08:14
  • @Raildex Unless you reinterpret_cast an object to something similar to char[], you are not allowed to examine it either. The only guarantee is that when you cast it back to the original type, you will get the original address. (This assumes you were casting to a different type; there are a few exceptions where you are allowed to modify the result. More can be found at [*reinterpret_cast*](https://en.cppreference.com/w/cpp/language/reinterpret_cast)) – Ranoiaetep Nov 22 '22 at 08:35
  • Thanks for your solution, but I notice that both `memcpy` and `bit_cast` need the type to be trivially copyable. Is there a solution for non-trivially copyable type? – SynchronizX Nov 23 '22 at 01:38
  • @SynchronizX Note that `memcpy` will compile with non-trivially copyable types, but it can lead to undefined behavior. Does `Inner` manually allocate additional memory? If so, that additional memory will be deallocated at the end of `Wrapper()`, which will lead to undefined behavior. You can potentially transfer the ownership of that memory manually, however. – Ranoiaetep Nov 23 '22 at 03:35
  • @SynchronizX On the other hand, there are other ways to hide header usage from the user, such as the PIMPL pattern and C++20 Modules – Ranoiaetep Nov 23 '22 at 03:37
  • @SynchronizX Here's more about [Hide contents of third-party C++ header file](https://stackoverflow.com/questions/13903280/hide-contents-of-third-party-c-header-file) – Ranoiaetep Nov 23 '22 at 03:39
  • @Ranoiaetep sorry to bother you again, but I have another doubt. If the key to avoiding undefined behaviour is that the object should be created as its original type and then cast to the intermediate type (the type of the placeholder), then placement new could be another acceptable solution, which means I can create the inner object with `new (data) Inner;`. Is this correct?
 – SynchronizX Nov 30 '22 at 08:07
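
The placement-new idea from the last comment can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the buffer is suitably sized and aligned for `Inner`, that the wrapper destroys the object itself, and that `std::launder` (C++17) is used when available to obtain a usable pointer from the buffer. The `Inner` type, `launder_if_available`, and `demo_placement_new` are hypothetical names for this example. In a real header the size and alignment would have to be hard-coded constants, since `Inner` would not be visible there.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // placement new, std::launder (C++17)
#include <string>
#include <utility>

// Stand-in for a non-trivially copyable third-party type.
struct Inner {
    explicit Inner(std::string n) : name(std::move(n)) {}
    std::size_t length() const { return name.size(); }
    std::string name;
};

// Hypothetical helper: uses std::launder where the library provides it.
template <class T>
T* launder_if_available(T* p) {
#if defined(__cpp_lib_launder)
    return std::launder(p);
#else
    return p;  // std::launder does not exist before C++17
#endif
}

class Wrapper {
public:
    explicit Wrapper(const std::string& n) {
        // Start the lifetime of an Inner object inside the buffer.
        ::new (static_cast<void*>(storage_)) Inner(n);
    }

    // The object was created manually, so it must be destroyed manually.
    ~Wrapper() { get()->~Inner(); }

    // Copying raw bytes would be wrong for a non-trivial Inner, so copying
    // is disabled in this sketch.
    Wrapper(const Wrapper&) = delete;
    Wrapper& operator=(const Wrapper&) = delete;

    std::size_t run() const { return get()->length(); }

private:
    Inner* get() {
        return launder_if_available(reinterpret_cast<Inner*>(storage_));
    }
    const Inner* get() const {
        return launder_if_available(reinterpret_cast<const Inner*>(storage_));
    }

    alignas(Inner) unsigned char storage_[sizeof(Inner)];
};

// Small usage example.
void demo_placement_new() {
    Wrapper w("inner");
    assert(w.run() == 5);
}
```

Unlike the `memcpy` and `bit_cast` variants, this does not require `Inner` to be trivially copyable, but copy and move operations for `Wrapper` then have to be written by hand if they are needed.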