
I tagged this language-lawyer, although I have the feeling that this is on the wrong side of the standard boundary. I haven't seen a discussion of exactly this point, but it came up at work, so I would like some certainty about it.

The issue is accessing (potentially) private fields of virtual base classes. Say I compute the offset of a private field of a class, and then use this offset outside the class to access (read/write) the member variable at this location.

I saw that GCC and Clang provide offsetof as an extension (it is conditionally defined in C++17; what does that mean?), and that using it is equivalent to some pointer arithmetic like this:

#include <cstddef>  // ptrdiff_t, offsetof
#include <iostream>

class A
{
    int a{};
public:
    int aa{};
    static ptrdiff_t getAOffset()
    {
        A instance;
        // Offset of the member from the start of the object: member address minus object address.
        return reinterpret_cast<const char*>(&instance.a) - reinterpret_cast<const char*>(&instance);
        //return offsetof(A, a); // "same" as this call to offsetof
    }

    int get() const
    {
        return a;
    }
};

class B: public virtual A
{
};

void update_field(char* pointer, ptrdiff_t offset, int value)
{
    int* field = reinterpret_cast<int*>(pointer + offset);
    *field = value;
}

void modify_a(B& instance)
{
    update_field(reinterpret_cast<char*>(dynamic_cast<A*>(&instance)), A::getAOffset(), 1);
}

int main()
{
    B instance;

    std::cout << instance.get() << std::endl;

    modify_a(instance);

    std::cout << instance.get() << std::endl;
}

I also made a coliru (pedantic) that doesn't complain, but still... https://coliru.stacked-crooked.com/a/faecd0b248eff651

Is there something in the standard that authorizes this, or is this undefined behavior land? I would also be happy to hear whether the answer differs between versions of the standard.
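
For context on the offsetof part of the question, here is a minimal sketch. It assumes the usual reading that, from C++17 on, offsetof is only guaranteed for standard-layout classes and is conditionally-supported otherwise (earlier standards left that case undefined). A class that mixes private and public data members, or one with a virtual base, is not standard-layout. The helper `offsetof_is_guaranteed` and the struct `Plain` are hypothetical names introduced only for illustration.

```cpp
#include <type_traits>

// Same shape as the classes in the question.
class A
{
    int a{};
public:
    int aa{};
    int get() const { return a; }
};

class B : public virtual A
{
};

// Hypothetical control case: one access level, no bases, no virtual functions.
struct Plain
{
    int x;
    int y;
};

// Hypothetical helper: true when the standard requires offsetof to work for T.
template <class T>
constexpr bool offsetof_is_guaranteed()
{
    return std::is_standard_layout<T>::value;
}

static_assert(offsetof_is_guaranteed<Plain>(), "Plain is standard-layout");
static_assert(!offsetof_is_guaranteed<A>(), "A mixes private and public data members");
static_assert(!offsetof_is_guaranteed<B>(), "B has a virtual base class");
```

The checks only say when offsetof is guaranteed; they do not settle whether the hand-written pointer arithmetic above is well defined.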

Matthieu Brucher
  • Related: https://stackoverflow.com/questions/6433339/does-the-offsetof-macro-from-stddef-h-invoke-undefined-behaviour – πάντα ῥεῖ Nov 29 '18 at 22:50
  • Interesting. OK, so `offsetof` would be tagged as UB by clang UBSAN (if I don't use the macro, or is it the volatile?)... What about computing the offset on a real class? – Matthieu Brucher Nov 29 '18 at 22:51
  • I get where you are coming from, but to me the important part is remembering that `private` makes it hard, but it can't make it impossible, to get at a variable. Like any wall, put in the effort and you'll get through it. – user4581301 Nov 29 '18 at 22:57
  • Yes, I agree. Some points of view are that breaking encapsulation this way is not correct C++. Then is accessing a variable through a class pointer + offset valid? – Matthieu Brucher Nov 29 '18 at 23:00
  • I’m probably misquoting, but Bjarne Stroustrup said that access controls protect against mistakes, not malice. – Pete Becker Nov 29 '18 at 23:10
  • Why does the virtual base class matter here? – geza Nov 29 '18 at 23:14
  • @geza I don't think it will, as there is only one instance of `A`. The usual pattern where this occurs is multiple diamond patterns. – Matthieu Brucher Nov 29 '18 at 23:27
  • If you publish the offset, you might as well just publish a pointer-to-member and make it 100% kosher. – T.C. Nov 29 '18 at 23:30
  • @geza — the virtual base matters because the offset within the `B` object is not fixed. Without the virtual base `offsetof` would give the correct value. – Pete Becker Nov 29 '18 at 23:30
  • @PeteBecker: Hmm. Maybe I don't see something trivial, but this code calculates offset from `A`. There's a cast (which is unnecessarily a dynamic_cast, as a simple static_cast would do the job), which goes `B` to `A`. So the offset is calculated based on `A`, isn't it? – geza Nov 29 '18 at 23:38
  • @TC well, the offset is actually stored in another class, where we store thousands of other such offsets, and they are then used in different ways to get to the fields (of different types, of course). What do you mean by pointer to member in that context? – Matthieu Brucher Nov 30 '18 at 07:49
  • @geza indeed, static_cast is enough in this case, but I thought dynamic_cast was required for multiple virtual inheritance to get to the actual base pointer (imagine that there could be several layers of virtual inheritance). – Matthieu Brucher Nov 30 '18 at 07:51
  • Bumping this again. `offsetof` is conditionally defined in C++17, what does it mean? – Matthieu Brucher Nov 30 '18 at 11:55
  • [Conditionally Supported](http://eel.is/c++draft/intro.defs#defns.cond.supp): Not required to be supported, implementation documents if it does. – Deduplicator Nov 30 '18 at 12:04
  • @MatthieuBrucher: you can use `static_cast` in the case of multiple inheritance as well. It's not a problem using `dynamic_cast`, as the compiler will use `static_cast` under the hood, it's just a little bit misleading. I think T.C. meant that instead of using an offset, you can just use a pointer to member (`int A::*`). – geza Nov 30 '18 at 12:35
  • @geza Oh, OK, that could work as well. I'll investigate this in the future, thanks for the reference. For dynamic_cast vs static_cast, are you sure this is true for multiple virtual inheritance? The position of the pointer is only known at run time, not compile time. – Matthieu Brucher Nov 30 '18 at 13:07
  • @MatthieuBrucher: yes. In this case, the compiler will generate a little code for `static_cast` as well (fetching the offset from the virtual table). `static_cast` doesn't always mean that "no code generated". – geza Nov 30 '18 at 13:27
  • Thanks for the note :) – Matthieu Brucher Nov 30 '18 at 13:27
  • If you need this for production code, you seemingly have a bigger problem than mere standard compliance. Perhaps you are looking at an instance of the XY problem. – n. m. could be an AI Nov 30 '18 at 13:44
  • By "pointer to member" people normally mean [pointer to member](https://www.google.com/search?q=pointer+to+member). If you are not familiar with the concept, you need to stop whatever you are doing and hit those search results. – n. m. could be an AI Nov 30 '18 at 13:53
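
The pointer-to-member suggestion from T.C. and geza can be sketched as follows. This is a minimal sketch under stated assumptions, not the production code described in the question: `getAMember`, the changed `update_field` signature, and `demo` are hypothetical names introduced here, and the sketch handles a single `int` field rather than thousands of fields of different types.

```cpp
#include <cassert>

class A
{
    int a{};
public:
    int aa{};
    // Hypothetical replacement for getAOffset(): hands out a pointer-to-member
    // instead of a byte offset.
    static int A::* getAMember()
    {
        return &A::a;
    }

    int get() const
    {
        return a;
    }
};

class B : public virtual A
{
};

// Writes through the pointer-to-member; no reinterpret_cast, no byte arithmetic.
void update_field(A& object, int A::* member, int value)
{
    object.*member = value;
}

void modify_a(B& instance)
{
    // The upcast to the virtual base is an ordinary static_cast.
    update_field(static_cast<A&>(instance), A::getAMember(), 1);
}

// Hypothetical driver: returns the private field's value after the update.
int demo()
{
    B instance;
    assert(instance.get() == 0); // zero from the default member initializer
    modify_a(instance);
    return instance.get();
}
```

Calling `demo()` returns 1. The class still decides which members it exposes, since only `A` can form `&A::a`, and the compiler handles locating the virtual base.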

0 Answers