25

Suppose there's a structure

struct Thing {
  int a;
  bool b;
};

and I get a pointer to member b of that structure, say as a parameter of some function:

void some_function (bool * ptr) {
  Thing * thing = /* ?? */;
}

How do I get a pointer to the containing object? Most importantly, I want to do this without violating any rule in the standard: I want standard-defined behaviour, not undefined or implementation-defined behaviour.

As a side note: I know that this circumvents type safety.

Daniel Jour
  • 2
    It's tricky to deduce your `Thing` pointer from a `bool` address. Why don't you pass the `Thing` address/pointer to your function? – Stefan Nov 23 '15 at 11:46
  • 1
    @Stefan It's supposed to be part of an allocator: It has nodes with bookkeeping and data, I hand out a pointer to the data on allocation and get that back on deallocation. – Daniel Jour Nov 23 '15 at 11:48
  • I just noticed your comment here about this being in an allocator. That means you *can’t* guarantee it’s standard-layout, and `offsetof` can be undefined. You’ve probably moved on from this since November 2015, but if it’s still relevant and a non-`offsetof` answer turns up, I suggest you implement it. Fortunately, for an allocator, you don’t need to return a pointer. You can return any pointer-like structure which satisfies `NullablePointer` and `RandomAccessIterator` and `typedef` it to `pointer`. – Daniel H Mar 16 '17 at 01:14
  • Just wondering: If you need to get the object from some member - what is the reason that you cannot simply pass the object as pointer instead of the member? – Aconcagua Mar 21 '17 at 12:36

4 Answers

22

If you are sure that the pointer is really pointing to the member b in the structure, like if someone did

Thing t;
some_function(&t.b);

Then you should be able to use the offsetof macro to get a pointer to the structure:

std::size_t offset = offsetof(Thing, b);
Thing* thing = reinterpret_cast<Thing*>(reinterpret_cast<char*>(ptr) - offset);

Note that if the pointer ptr doesn't actually point to the Thing::b member, then the above code will lead to undefined behavior if you use the pointer thing.
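
As a minimal sketch, the idea above could be wrapped in a small helper. The name `thing_from_b` is hypothetical and not part of the original answer. The `static_assert` guards the precondition raised in the comments, namely that `offsetof` is only guaranteed to be supported for standard-layout types. It does not settle whether the pointer arithmetic itself is strictly defined behaviour, which the comments still dispute.

```cpp
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout

struct Thing {
  int a;
  bool b;
};

// offsetof is only guaranteed to be supported for standard-layout types,
// so reject anything else at compile time.
static_assert(std::is_standard_layout<Thing>::value,
              "offsetof(Thing, b) requires a standard-layout type");

// Hypothetical helper (an assumption for illustration): recover the
// enclosing Thing from a pointer that is known to point at its b member.
// Passing any other bool* makes the result unusable.
inline Thing* thing_from_b(bool* ptr) {
  const std::size_t offset = offsetof(Thing, b);
  return reinterpret_cast<Thing*>(reinterpret_cast<char*>(ptr) - offset);
}
```

With `Thing t;`, a call such as `thing_from_b(&t.b)` is expected to yield `&t` on common implementations.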

Some programmer dude
  • 1
    Hmm... but what if `sizeof(int8_t) != sizeof(char)` ? – deviantfan Nov 23 '15 at 11:48
  • 1
    @deviantfan Considering that `sizeof(char)` is defined in the specification to *always* result in `1` that won't be much of a problem. :) And that's the actual point of using `int8_t`, it's defined to be a byte which `char` might not be. – Some programmer dude Nov 23 '15 at 11:49
  • 3
    AFAIK `char*` is completely safe and canonical way for this kind of pointer arithmetic. – Karoly Horvath Nov 23 '15 at 11:55
  • 1
    Ah ... because `offsetof` returns the offset *in bytes*. Tricky ^^ – Daniel Jour Nov 23 '15 at 11:55
  • 2
    Ok, got too confused there :) char can't be less than 8bit because minimum range, and if it is more than 8 bit, `int8_t` can't even exist because addresses and sizeof stuff... everything fine. – deviantfan Nov 23 '15 at 11:57
  • 1
    This *does* work with `char*` just as well, right? And that's how most people would write this code, wouldn't they? – Karoly Horvath Nov 23 '15 at 12:14
  • 11
    “ that's the actual point of using `int8_t`, it's defined to be a byte which `char` might not be” — no, you’ve got that completely the wrong way round. Use `char` here, `int8_t` is wrong. – Konrad Rudolph Nov 23 '15 at 13:42
  • 2
    @JoachimPileborg "that's the actual point of using int8_t, it's defined to be a byte which char might not be" - the `offsetof` definition comes from C, which clearly states that a character is a byte, so "offset in bytes" == "offset in chars". If `sizeof(char)` is guaranteed to be 1, then, `char *` is surely the right type to use when dealing with values from `offsetof`? – davmac Jun 03 '16 at 10:02
  • @davmac There are systems with characters that are not 8 bits, and they probably won't have "byte" types like `int8_t` either. However, `sizeof(char)` will always be `1` no matter the underlying actual bit-size of `char`. – Some programmer dude Jun 03 '16 at 10:34
  • @JoachimPileborg exactly; and `offsetof` will on those systems return the offset in `char` units (just as it always does) - so why are you using `int8_t` and not `char`? – davmac Jun 03 '16 at 10:48
  • @davmac Simply to show that I'm dealing with bytes and not characters. I.e. it's a semantic choice, for readers of the code more than the compiler. – Some programmer dude Jun 03 '16 at 10:51
  • 3
    Also note that `offsetof` has undefined behavior for non-standard-layout types (C++11). In practice it often still works as expected. – davmac Jun 03 '16 at 11:53
  • Is there a solution that works even for non-standard-layout types? If not, is there any sort of guarantee at all? – Daniel H Mar 15 '17 at 21:21
  • @DanielH It doesn't matter what layout is used. As long as `offsetof` implemented correctly for this layout. You can't have two different layouts of same structure at same time, can you? – Michael Nastenko Mar 17 '17 at 02:40
  • 9
    The term “standard layout” means, approximately, “something that doesn’t use any features C doesn’t have”. It turns out `offsetof` is actually a C feature, not a C++ one, and the C++ specification explicitly says it is undefined on classes which aren’t standard layout. See also [`std::is_standard_layout`](http://en.cppreference.com/w/cpp/types/is_standard_layout), [`offsetof`](http://en.cppreference.com/w/cpp/types/offsetof), and [the Standard Layout section of the non-static data members page](http://en.cppreference.com/w/cpp/language/data_members#Standard_layout) on cppreference. – Daniel H Mar 17 '17 at 15:33
  • 1
    This answer could be improved by highlighting some of the risks and caveats of using `offsetof`. Without reading the linked page, and without reading the comments, at face value the answer does not indicate that measures must be taken to make sure the behavior is actually defined, such as making sure the type is standard layout. Edit: I don't think `reinterpret_cast<char*>(ptr) - offset` is allowed. This answer has UB. – François Andrieux Nov 09 '20 at 21:20
6
X* get_ptr(bool* b){
    static typename std::aligned_storage<sizeof(X),alignof(X)>::type buffer;

    X* p=static_cast<X*>(static_cast<void*>(&buffer));
    ptrdiff_t const offset=static_cast<char*>(static_cast<void*>(&p->b))-static_cast<char*>(static_cast<void*>(&buffer));
    return static_cast<X*>(static_cast<void*>(static_cast<char*>(static_cast<void*>(b))-offset));
}

First, we create some static storage that could hold an X. Then we get the address of the X object that could exist in the buffer, and the address of the b element of that object.

Casting back to char*, we can thus get the offset of the bool within the buffer, which we can then use to adjust a pointer to a real bool back to a pointer to the containing X.
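
For reference, here is a self-contained sketch of the same approach. Two things in it are assumptions rather than part of the answer: the definition of `X` is a hypothetical stand-in, and the buffer is declared as an `alignas(X)` array of `unsigned char` instead of `std::aligned_storage`, which is deprecated as of C++23. The address arithmetic is otherwise unchanged, and the same open questions about strict conformance apply.

```cpp
#include <cstddef>  // std::ptrdiff_t

// Hypothetical stand-in for the answer's X: any class with a bool member b.
struct X {
  int a;
  bool b;
};

X* get_ptr(bool* b) {
  // Storage that is large enough and suitably aligned for an X.
  // No X is constructed here; the buffer is used only to compute addresses.
  alignas(X) static unsigned char buffer[sizeof(X)];

  X* p = static_cast<X*>(static_cast<void*>(buffer));

  // Distance in bytes from the start of an X to its b member.
  const std::ptrdiff_t offset =
      static_cast<char*>(static_cast<void*>(&p->b)) -
      static_cast<char*>(static_cast<void*>(buffer));

  // Step back from the real bool to the start of its enclosing X.
  return static_cast<X*>(
      static_cast<void*>(static_cast<char*>(static_cast<void*>(b)) - offset));
}
```

Given `X x;`, calling `get_ptr(&x.b)` is expected to return `&x`.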

Anthony Williams
  • Do you mean to declare `buffer` as `static typename std::aligned_storage<sizeof(X),alignof(X)>::type`? Also, is there a reason to `static_cast` through `void*` instead of simply `reinterpret_cast`ing directly to `char*` and `X*`? – Daniel H Mar 20 '17 at 18:38
  • Yes, well-spotted. I don't like `reinterpret_cast`, as the mapping is ill-specified. `static_cast` has a defined mapping. – Anthony Williams Mar 20 '17 at 20:22
  • I guess that makes sense in general, although `reinterpret_cast` for pointers is well defined. I don’t like that this reserves the extra space, but otherwise it seems good. I am very slightly worried about the offset not being the same between this `X` and the one for the argument one, but if the compiler does that (I can’t find any evidence it can’t, for non-standard-layout, but I doubt any do), I’m not sure there’s any way to get around it. – Daniel H Mar 20 '17 at 21:53
  • I’m sorry I didn’t get here in time yesterday to assign the full bounty; this still seems like the best solution. Although I do like @rfb’s making a generic function for it, I’m not sure it’s standards-compliant because of alignment issues. – Daniel H Mar 24 '17 at 13:54
  • Sorry, but if my answer has value, I think it is not because it is generic but because it brings forward the idea of a built-in displacement mechanism provided by the class data member pointer. My solution is in line with @AnthonyWilliams's answer, except that instead of using storage it uses the knowledge of that offset. In fact, the posts I drew on revolve around the problem of getting the offset from a member pointer without a temporary instance. – rfb Mar 24 '17 at 15:25
  • Going by @MichaelNastenko's comment on the `offsetof` solution, I guess there are no alignment issues, unless you can have two different layouts of the same structure at the same time. – rfb Mar 24 '17 at 15:42
  • The potential for alignment issues is why I use `aligned_storage`. `T*` is allowed to have fewer bits than `void*`; you may not be able to cast an arbitrary pointer to a `T*` for some `T`. – Anthony Williams Mar 28 '17 at 12:09
5
void some_function (bool * ptr) {
  Thing * thing = (Thing*)(((char*)ptr) - offsetof(Thing,b));
}

I think there is no UB.

deviantfan
3

My proposal is derived from @Rod's answer to "Offset from member pointer without temporary instance" and the similar answer by @0xbadf00d to "Offset of pointer to member".

I started from the assumption that a pointer to a class data member is implemented as some form of offset, which was later confirmed by the posts above and by the tests I've made.

I'm not a C++ practitioner, so sorry for the brevity.

#include <iostream>
#include <cstddef>  // std::ptrdiff_t

struct Thing {
    int a;
    bool b;
};

template<class T, typename U>
std::ptrdiff_t member_offset(U T::* mem)
{
    return 
    ( &reinterpret_cast<const char&>( 
        reinterpret_cast<const T*>( 1 )->*mem ) 
      - reinterpret_cast<const char*>( 1 )      );
}

template<class T, typename U>
T* get_T_from_data_member_pointer (U * ptr, U T::*pU) {
  return reinterpret_cast<T*> (
      reinterpret_cast<char*>(ptr) 
    - member_offset(pU));
}

int main()
{

    Thing thing;
    thing.b = false;

    bool * ptr = &thing.b;
    bool Thing::*pb = &Thing::b;

    std::cout << "Thing object address accessed from Thing test object lvalue; value is: " 
        << &thing << "!\n";     
    std::cout << "Thing object address derived from pointer to class member; value is: " 
        << get_T_from_data_member_pointer(ptr, pb) << "!\n";
}
rfb
  • Is using `1` as a pointer actually standards-compliant, or could it have alignment issues or something? I actually think you aren’t allowed to convert any `int` to a pointer unless that value was calculated from a limited set of operations *from* a pointer originally, although it might be that you just aren’t allowed to dereference that. Although I did think many times when looking into this that I wish C++ provided by default a `T* operator-(U*, U T::*)` for all types `T` and `U`. – Daniel H Mar 24 '17 at 13:58
  • Or at least that `gcc` provided that as an extension, since I’m pretty sure it actually stores pointer-to-data-member variables as a `ptrdiff_t` (https://refspecs.linuxfoundation.org/cxxabi-1.86.html). – Daniel H Mar 24 '17 at 14:40
  • I don't think it could be compliant, like the majority of `reinterpret_cast` uses, if I remember correctly from my old study of C++98. But the conversion is only a trick to deduce a displacement, given that the class data member pointer is implemented as an offset/displacement. It just provides a base for pointer arithmetic; nothing is dereferenced. Correct me if I'm wrong. @Rod's answer points out that the convoluted syntax (a char reference first, then taking its address) is there because of compiler warnings along the lines of your observations. – rfb Mar 24 '17 at 15:03