6

Is there any difference between the following three casts for extracting raw byte pointers for use in pointer arithmetic? (assume a platform where char is 1 byte.)

  1. static_cast<char*>((void*)ptr)
  2. reinterpret_cast<char*>(ptr)
  3. (updated) or: static_cast<char*>(static_cast<void*>(ptr))

Which should I prefer?

In more detail...

Given pointers to two member objects in a class, I would like to compute an offset from one to the other, so that I can reconstruct the address of one member given an offset and the address of the other member.

// assumed data layout:
struct C {
  // ...
  A a;
  // ...
  B b;
};

The code that I use at the moment is along the lines of:

void approach1( A *pa, B *pb )
{
  // compute offset:
  std::ptrdiff_t offset = static_cast<char*>((void*)pa) - static_cast<char*>((void*)pb);
  // then in some other function...
  // given offset and ptr to b, compute ptr to a:
  A *a = static_cast<A*>( (void*)(static_cast<char*>((void*)pb) + offset) );
}

int main()
{
  C c;
  approach1(&c.a, &c.b);
}

I would like to know whether the following is better (or worse):

void approach2( A *pa, B *pb )
{
  std::ptrdiff_t offset = reinterpret_cast<char*>(pa) - reinterpret_cast<char*>(pb);
  // ...
  A *a = reinterpret_cast<A*>( reinterpret_cast<char*>(pb) + offset );
}

Are the two methods entirely equivalent? Are they equally portable?

My impression is that approach1() is more portable, because "static_casting a pointer to and from void* preserves the address," whereas reinterpret_cast<> guarantees less (see accepted answer at link).

I would like to know what the cleanest way to do this is.

Update: Explanation of Purpose

A number of people have asked what the purpose of computing these offsets is. The purpose is to construct a meta-class table of instance offsets. This is used by a runtime reflection mechanism for automatic GUI building and persistence (the offsets are not serialized, just used to traverse the structure). The code has been in production for over 15 years. For the purposes of this question I just want to know the most portable way of computing the pointer offsets. I have no intention of making large changes to the way the metaclass system works. In addition, I'm also generally interested in the best way to do this, as I have other uses in mind (e.g. difference pointers for shared memory code).

NOTE: I cannot use offsetof() because in my actual code I only have the pointers to instances a and b; I don't necessarily have the type of the containing object c or other static info to use offsetof(). All I can assume is that a and b are members of the same object.

Ross Bencina
  • c++11 or not? some specifics of reinterpret_cast have changed IIRC. Apart from that: why do you want to do this? – stijn Sep 03 '14 at 07:43
  • This might help a little http://stackoverflow.com/questions/332030/when-should-static-cast-dynamic-cast-const-cast-and-reinterpret-cast-be-used – Stefan Falk Sep 03 '14 at 07:46
  • @stijn C++03 or earlier preferably. Answers that explain the differences between different C++ versions would be appreciated so I can understand. I'll update the question with a description of purpose. – Ross Bencina Sep 03 '14 at 07:47
  • @StefanFalk thanks Stefan, I guess that favours static_cast<char*>(static_cast<void*>(p)) over reinterpret_cast<char*>(p) – Ross Bencina Sep 03 '14 at 07:54
  • I would assume that the two cast expressions always have the same result. The C cast to void* in the first expression won't change the value and the subsequent static cast to char* shouldn't change it either, since there is no type information in the void pointer. A reinterpret cast (as in the second expression) shouldn't change the value to begin with, unless I'm missing a subtlety (@stijn?). That's why I'd expect the results of both expressions to be equal for all pointers to objects. Can somebody produce counter examples? – Peter - Reinstate Monica Sep 03 '14 at 08:32
  • @PeterSchneider, sorry to add to the confusion, but do you think static_cast<char*>((void*)p) is equivalent to static_cast<char*>(static_cast<void*>(p)), or should I add the double static cast as a third option? – Ross Bencina Sep 03 '14 at 08:36
  • The C style cast to void* will usually result in a static_cast (which can cast pointers to and from void*), so that the two expressions are equivalent. Having the static cast explicit would be my preference although it looks decidedly ugly. One could make it a function.- I said "usually" because the C style cast may, as we know, result in a reinterpret cast if necessary; not sure whether such a scenario is conceivable (incomplete type of p?). – Peter - Reinstate Monica Sep 03 '14 at 09:08
  • @PeterSchneider regarding reinterpret_cast: pre-C++11 the value returned by reinterpret_cast is undefined and basically shouldn't be used for anything other than doing another reinterpret_cast back to the same type. That is not exactly what happens here. See http://stackoverflow.com/questions/573294/when-to-use-reinterpret-cast – stijn Sep 03 '14 at 09:14
  • Another thing is that nominally the result of a reinterpret cast is not required to preserve the bit pattern of the argument (only a cast back to the original type is required to be equal, provided the intermediate type was large enough to hold the pointer value). Cf. http://stackoverflow.com/questions/573294/when-to-use-reinterpret-cast. That makes me lean towards the double static cast, which should also catch unintended use cases where the reinterpret cast (perhaps implicitly by means of a C cast) would misinterpret funny types. – Peter - Reinstate Monica Sep 03 '14 at 09:15
  • 1
    @stijn found the same post :-). I didn't personally check the standards yet (all 3??) but would trust that post. The question is largely academical though -- some exotic platforms may change the bit pattern (there usually is a reason for unexpected freedoms the standard grants) but in everyday life I bet a cent to the dollar that the reinterpret cast translates to a NOP (the "unsurprising result" the standard intends), and a quick test for any given platform will verify that. Users of exotic machines, on the other hand, usually know what they are doing. – Peter - Reinstate Monica Sep 03 '14 at 09:25

2 Answers

6

These two will lead to the same result, so the difference is mostly semantic. reinterpret_cast has exactly the meaning of the operation you want, and it needs only one cast instead of two (the fewer casts you have in your code, the better).

reinterpret_cast

5.2.10/7: An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast< cv T* >(static_cast< cv void* >(v)).

So unless you run into some exotic low-level behaviour on an ancient platform, you should definitely go with:

reinterpret_cast<char*>(ptr);

In general.
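
To make the equivalence concrete, here is a minimal sketch that converts the same member pointers with all three spellings from the question and reports whether the results compare equal. The struct `C` and the helper names `via_c_cast`, `via_reinterpret`, `via_static` and `casts_agree` are made up for illustration; they are not from the question's code base. It is a sanity check for a given platform, not a proof of portability.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout standing in for the question's struct C.
struct C {
    int    a;
    double b;
};

// 1. C-style cast to void*, then static_cast to char*.
template <typename T>
char* via_c_cast(T* p) { return static_cast<char*>((void*)p); }

// 2. A single reinterpret_cast.
template <typename T>
char* via_reinterpret(T* p) { return reinterpret_cast<char*>(p); }

// 3. Two static_casts through void*.
template <typename T>
char* via_static(T* p) { return static_cast<char*>(static_cast<void*>(p)); }

// Returns true if all three spellings agree for both members of c.
bool casts_agree(C& c) {
    return via_c_cast(&c.a) == via_reinterpret(&c.a)
        && via_reinterpret(&c.a) == via_static(&c.a)
        && via_c_cast(&c.b) == via_reinterpret(&c.b)
        && via_reinterpret(&c.b) == via_static(&c.b);
}
```

Calling `casts_agree(c)` on any `C c` should return true on mainstream compilers; under the C++11 wording quoted above, spellings 2 and 3 are defined to be the same conversion.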

That said, why don't you use uintptr_t in your case? It's even more appropriate, since you don't need a pointer:

void approach3( A *pa, B *pb )
{
  std::ptrdiff_t offset = reinterpret_cast<std::uintptr_t>(pa) - reinterpret_cast<std::uintptr_t>(pb);
  // ...
  A *a = reinterpret_cast<A*>( reinterpret_cast<std::uintptr_t>(pb) + offset );
}
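
A self-contained sketch of the same idea follows, assuming C++11 (`std::uintptr_t` lives in `<cstdint>` and is an optional type). The struct `C` and the helper names `int_offset`, `char_offset` and `apply_int_offset` are hypothetical. The pointer-to-integer mapping is implementation-defined, so treat this as something to verify on your platform rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>   // std::uintptr_t (C++11, optional type)

// Hypothetical layout standing in for the question's struct C.
struct C {
    int    a;
    double b;
};

// Byte offset from `from` to `to`, computed on integers.
// The unsigned difference wraps modulo 2^N; converting it to ptrdiff_t
// yields the signed distance on common two's-complement platforms.
template <typename To, typename From>
std::ptrdiff_t int_offset(To* to, From* from) {
    return static_cast<std::ptrdiff_t>(
        reinterpret_cast<std::uintptr_t>(to) -
        reinterpret_cast<std::uintptr_t>(from));
}

// The same offset computed on char* for comparison.
template <typename To, typename From>
std::ptrdiff_t char_offset(To* to, From* from) {
    return reinterpret_cast<char*>(to) - reinterpret_cast<char*>(from);
}

// Rebuild a To* from a base pointer plus a byte offset, via integers.
template <typename To, typename From>
To* apply_int_offset(From* base, std::ptrdiff_t offset) {
    return reinterpret_cast<To*>(
        reinterpret_cast<std::uintptr_t>(base) +
        static_cast<std::uintptr_t>(offset));
}
```

With `C c;`, the expression `apply_int_offset<int>(&c.b, int_offset(&c.a, &c.b))` is expected to compare equal to `&c.a` on flat-address platforms, and `int_offset` should match `char_offset` there.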

For additional information see:

http://en.cppreference.com/w/cpp/language/reinterpret_cast

Drax
  • regarding `uintptr_t`: I'd say the difference between two pointers is best computed on the pointers, not on integers. – Ross Bencina Sep 03 '14 at 09:09
  • 3
    @RossBencina i'd say the exact opposite :) You're computing an offset not a number of elements, and that's why you are casting to `char*` in the first place, because `char` is one byte in size. But you don't want the number of characters between your pointers, you want the bytes offset which is more naturally represented by numbers than by pointers. Pointer difference means number of elements between them. – Drax Sep 03 '14 at 09:11
  • if I replaced `char*` with `byte_t`, where `sizeof(byte_t) == 1` would you be happier? Also, see Peter Schneider's comment on the question about reinterpret_cast not guaranteeing bit pattern -- I think this breaks your `uintptr_t` approach. – Ross Bencina Sep 03 '14 at 09:28
  • @RossBencina `C++ 5.2.10.4: The mapping function is implementation-defined. [Note: It is intended to be unsurprising to those who know the addressing structure of the underlying machine. — end note]` Actually not preserving the bit pattern is _most likely_ a freedom in order so this can work on platforms where addresses do not have the same representation as integers. But i agree that there is no perfect guarantee, although there is no other sane thing to do :) – Drax Sep 03 '14 at 10:21
  • 1
    Also see this answer for a random horrible case : http://stackoverflow.com/a/22643724/1147772 :D – Drax Sep 03 '14 at 11:02
  • I like the last paragraph of that "But you're better off using pointer arithmetic directly" :) – Ross Bencina Sep 03 '14 at 12:37
  • That is, if you consider exotic platforms to be a potential target for your software :) – Drax Sep 03 '14 at 12:45
0

I do not recommend calculating offset distances between class members' addresses. The compiler might inject padding, and even if the approach works, it will work the same way only for that specific compiler running on that specific host. There are a multitude of sources of error when applying this practice. For example, what if you have to deal with the famous virtual tables and the memory layout under multiple virtual inheritance? That would render your solution totally unusable.

So back to the roots: Why are you trying to do this? Maybe there is a better solution.

EDIT/Update

Thanks for explaining the reason. It is a very interesting approach that I had not seen until now. I have learned something today.

However, I still stick to my point that there should be a much easier way of handling this. As a proof of concept, I wrote a small application to see which of your methods works. For me, neither of them works.

The application is a slightly expanded version of your methods; here it is:

#include <cstddef>   // std::ptrdiff_t
#include <iostream>
#include <stdio.h>
#include <string>

struct A
{
    A(const std::string& pa) : a(pa) {printf("CTR: A address: %p\n", this) ;}
    std::string a;
};

struct B
{
    B(const std::string& pb) : b(pb) {printf("CTR: B address: %p\n", this) ;}
    std::string b;
};

// assumed data layout:
struct C {

    C() : a("astring"), b("bstring") {}
  // ...
  A a;
  // ...
  B b;
};

void approach1( A *pa, B *pb )
{

    printf("approach1: A address: %p B address: %p\n", pa, pb); 
    // compute offset:
    std::ptrdiff_t offset = static_cast<char*>((void*)pb) - static_cast<char*>((void*)pa);
    // then in some other function...
    // given offset and ptr to b, compute ptr to a:
    A *a = static_cast<A*>( (void*)(static_cast<char*>((void*)pb) + offset) );
    printf("approach1: a address: %p \n", a); 

    std::cout << "approach1: A->a=" << a->a << std::endl;
}


void approach2( A *pa, B *pb )
{
    printf("approach2: A address: %p B address: %p\n", pa, pb); 

    std::ptrdiff_t offset = reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(pa);

    A *a = reinterpret_cast<A*>( reinterpret_cast<char*>(pb) + offset );
    printf("approach2: a address: %p \n", a); 
    std::cout << "approach2: A->a=" << a->a << std::endl;
}

int main()
{
  C c;
  std::cout << c.a.a << std::endl;

  approach1(&c.a, &c.b);
  approach2(&c.a, &c.b);
}

The output of it on my computer (uname -a Linux flood 3.13.0-33-generic #58-Ubuntu SMP Tue Jul 29 16:45:05 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux) with my compiler (g++ (Ubuntu 4.8.2-19ubuntu1) 4.8.2) is:

CTR: A address: 0x7fff249f0900
CTR: B address: 0x7fff249f0908
astring
approach1: A address: 0x7fff249f0900 B address: 0x7fff249f0908
approach1: a address: 0x7fff249f0910 
approach1: A->a=<GARBAGE>
approach2: a address: 0x7fff249f0910 

where <GARBAGE> as expected contains ... garbage.

See it at: http://ideone.com/U8ahAL

Ferenc Deak
  • 2
    I have updated the question with a description of the purpose. Unfortunately you have not answered my question, just told me that I'm doing it wrong -- that would have been better done in a comment. I am aware of the hazards. – Ross Bencina Sep 03 '14 at 07:56
  • @fritzone I think the code should work, it's just that the logic `pb - pa` is wrong as @RossBencina says. Fix the logic and your examples shows it works well. – Mine Sep 03 '14 at 08:54
  • @Mine and RossBencina Agree with both of you. I am trying to crucify the working code with all kind of weird inheritance and virtual stuff, but still works :) – Ferenc Deak Sep 03 '14 at 08:56
  • 1
    most programmers with such a specific requirement as doing address arithmetic on members are probably using standard-layout objects, and possibly even layouts with known or `pragma`d packing, that makes this kinda thing far more (if not theoretically completely) portable. i've certainly never wanted to do any low-level bytewise stuff on any virtual object. but it sounds like you got it working even then. can you edit your post to reflect what "the working code" means, since you left it by saying it _didn't_ work? – underscore_d Jan 07 '16 at 19:15
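
As the comments above point out, the test program computes the offset as pb minus pa and then adds it to pb, which lands past b instead of on a. A minimal sketch with the direction matching the question (pa minus pb) might look like the following; the helper names `offset_b_to_a` and `a_from_b` are hypothetical, and the structs are simplified versions of the ones in the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct A { std::string a; };
struct B { std::string b; };

// Simplified stand-in for the answer's struct C.
struct C {
    C() { a.a = "astring"; b.b = "bstring"; }
    A a;
    B b;
};

// Offset that takes the address of the B member to the address of the
// A member: (address of a) - (address of b), in bytes.
std::ptrdiff_t offset_b_to_a(A* pa, B* pb) {
    return static_cast<char*>(static_cast<void*>(pa))
         - static_cast<char*>(static_cast<void*>(pb));
}

// Given a pointer to the B member and the offset above, recover the A member.
A* a_from_b(B* pb, std::ptrdiff_t offset) {
    return static_cast<A*>(static_cast<void*>(
        static_cast<char*>(static_cast<void*>(pb)) + offset));
}
```

With `C c;`, `a_from_b(&c.b, offset_b_to_a(&c.a, &c.b))` is expected to give back `&c.a`, so reading `->a` through it should yield "astring" rather than garbage.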