9

My mental model of reinterpret_cast has always been that it treats the sequence of bits of an expression as if they were of a different type, and cppreference (note: this is not a quote from the C++ Standard) seems to agree:

Unlike static_cast, but like const_cast, the reinterpret_cast expression does not compile to any CPU instructions. It is purely a compiler directive which instructs the compiler to treat the sequence of bits (object representation) of expression as if it had the type new_type.

Looking for guarantees, I stumbled across a note under [expr.reinterpret.cast]:

[ Note: The mapping performed by reinterpret_cast might, or might not, produce a representation different from the original value. — end note ]

That left me wondering: Under which conditions does a reinterpret_cast produce a value whose object representation differs from that of the original value?

IInspectable
  • 1
    No casts change the object they are applied to, if that is what you are asking. –  Aug 25 '18 at 21:34
  • @NeilButterworth: I don't feel, that edit adds anything that wasn't there already. – IInspectable Aug 25 '18 at 22:03
  • 1
    Immediately comes to mind a situation when different pointer types have different alignment requirements and actively zero out the low-order bits. On such platform casting from a less strict alignment requirement to a more strict alignment requirement will zero out some bits. – AnT stands with Russia Aug 25 '18 at 22:31
  • @AnT: That helped me understand, why `reinterpret_cast` round-tripping (`T1*` → `T2*` → `T1*`) is only guaranteed to return the same object representation, in case `T2`'s alignment requirements are no stricter than `T1`'s. Thanks for that. – IInspectable Aug 27 '18 at 19:52

2 Answers

9

Here's an example. The 4th bullet point of [expr.reinterpret.cast] reads:

A pointer can be explicitly converted to any integral type large enough to hold all values of its type. The mapping function is implementation-defined. [ Note: It is intended to be unsurprising to those who know the addressing structure of the underlying machine. — end note ]

So the value that i will have here is implementation-defined:

void *ptr = <some valid pointer value>;
uintptr_t i = reinterpret_cast<uintptr_t>(ptr);

It can be anything, provided that reinterpret_casting i back to void * yields ptr again.

The representations of ptr and i can differ. The standard only says that the value of i should be "unsurprising". Moreover, if we reinterpret_cast ptr to a wider integer (for example, a 32-bit pointer cast to unsigned long long int), the representation must differ, because the sizes of the two objects differ.
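As a minimal sketch of the point above (the helper names `round_trips`, `same_object_representation`, and `demo_round_trip` are made up for illustration, and `std::uintptr_t` is assumed to be available, which the standard does not require): the round trip is guaranteed, while byte-for-byte equality of the two objects is something you can only observe on a given implementation, not rely on.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if converting p to std::uintptr_t and back
// yields the original pointer value. The standard guarantees this for an
// integer type large enough to hold the pointer.
bool round_trips(void* p) {
    std::uintptr_t i = reinterpret_cast<std::uintptr_t>(p);  // implementation-defined mapping
    return reinterpret_cast<void*>(i) == p;
}

// Hypothetical helper: true if the integer holds exactly the same bytes as
// the pointer did. Nothing in the standard promises this; it merely reports
// what the current implementation happens to do.
bool same_object_representation(void* p) {
    std::uintptr_t i = reinterpret_cast<std::uintptr_t>(p);
    return sizeof i == sizeof p && std::memcmp(&i, &p, sizeof p) == 0;
}

// Usage sketch: only the round trip is asserted, because only that is portable.
void demo_round_trip() {
    int x = 42;
    assert(round_trips(&x));
}
```

On mainstream flat-address platforms `same_object_representation` will typically return true, but that is an observation about those platforms rather than a language guarantee.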

So I think the cppreference description is misleading, because some reinterpret_casts actually do need CPU instructions.


Here is another case (found by IInspectable), a comment by Keith Thompson:

The C compiler for Cray vector machines, such as the T90, do something similar. Hardware addresses are 8 bytes, and point to 8-byte words. void* and char* are handled in software, and are augmented with a 3-bit offset within the word -- but since there isn't actually a 64-bit address space, the offset is stored in the high-order 3 bits of the 64-bit word. So char* and int* are the same size, but have different internal representations -- and code that assumes that pointers are "really" just integers can fail badly.

char * and int * have different representations on Cray T90, so:

int *i = <some int pointer value>;
char *c = reinterpret_cast<char *>(i);

Here, i and c will have differing representations on Cray T90 (and doing this conversion definitely uses CPU instructions).

(I've verified this against chapter 3.1.2.7.1 of the Cray C/C++ Reference Manual, SR–2179 2.0.)
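To see what a particular implementation does with this conversion, one could compare the stored bytes of the two pointers. This is only a sketch with invented helper names (`pointer_bytes_match`, `converts_back`, `demo_int_to_char`); on a Cray T90 as described above the byte comparison would presumably report a mismatch, whereas on common byte-addressed machines it usually reports a match.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: compares the bytes stored in an int* with the bytes
// stored in the char* that reinterpret_cast produces from it. The result is
// implementation-specific and not guaranteed either way.
bool pointer_bytes_match(int* i) {
    char* c = reinterpret_cast<char*>(i);
    return sizeof c == sizeof i && std::memcmp(&c, &i, sizeof i) == 0;
}

// Hypothetical helper: int* -> char* -> int* must give back the original
// pointer, because char has no stricter alignment requirement than int.
bool converts_back(int* i) {
    char* c = reinterpret_cast<char*>(i);
    return reinterpret_cast<int*>(c) == i;
}

// Usage sketch: only the round trip is asserted, since only it is portable.
void demo_int_to_char() {
    int value = 0;
    assert(converts_back(&value));
}
```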

geza
  • 1
    I still don't see how reinterpret_cast is supposed to magically do these compile-time type conversions. And this question can be quite simply answered by someone posting the emitted assembler code that illustrates when it does. –  Aug 25 '18 at 23:09
  • @NeilButterworth: what do you mean by "magically"? In my example, we have a pointer. It is converted to integer. The mapping is implementation defined, can be anything. So, representation can be anything as well, i.e., it can change. End of story. – geza Aug 25 '18 at 23:20
  • 2
    @NeilButterworth: Given `#include <cstdint>` / `uint64_t foo(void *x) { return reinterpret_cast<uint64_t>(x); }`, Apple LLVM 9.1.0 (clang-902.0.39.2) invoked with `c++ -O3 -m32 -S` generates `movl 8(%ebp), %eax` / `xorl %edx, %edx`, thus taking as input a four-byte object and producing as output an eight-byte object. – Eric Postpischil Aug 26 '18 at 00:01
  • Note: notes are non-normative. Compilers are free to disregard them and they are still C++ standard compliant. – Yakk - Adam Nevraumont Aug 26 '18 at 00:11
  • @Yakk-AdamNevraumont: I've seen this note several times, where is this information written? I mean, which part is normative and which is not in the standard (supposedly parts between '[' and ']' are non-normative)? I haven't found it in the C++ standard itself. Is it in some referenced document? – geza Aug 26 '18 at 00:22
  • @geza https://stackoverflow.com/q/21364398/1774667 -- it is an iso standard thing – Yakk - Adam Nevraumont Aug 26 '18 at 00:23
  • @Yakk-AdamNevraumont: Thanks! For almost any question, there is an answer at SO already :) – geza Aug 26 '18 at 00:27
  • Could you also add a somewhat unexpected case on word-addressed architectures, like explained in [this comment](https://stackoverflow.com/questions/399003/is-the-sizeofsome-pointer-always-equal-to-four/399122#comment14833666_399122)? – IInspectable Aug 26 '18 at 21:20
  • @IInspectable: nice find, I've added some information to my answer. – geza Aug 26 '18 at 22:51
  • _" if we reinterpret_cast ptr to a wider integer"_ -- the MSVC 14.1 compiler even does a [sign-extension of the pointer value](https://stackoverflow.com/q/43035539/7571258), a good example of how the representation can change ! – zett42 Aug 26 '18 at 23:37

-3

You are correct that reinterpret_cast does not change the bit values; however, that doesn't mean the resulting value doesn't change.

One simple example would be casting a 32-bit integral type to a char[4] with each element representing one octet of an IPv4 address.
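A rough sketch of that idea follows. C++ does not allow a reinterpret_cast directly to an array type, so this version instead reads the bytes through an `unsigned char` pointer, which is permitted for inspecting any object. The names `octets_in_memory` and `demo_octets` are invented for illustration, and the order in which the octets appear depends on the platform's endianness.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the four bytes of a 32-bit value in the order
// they are stored in memory (which depends on endianness).
std::array<unsigned char, 4> octets_in_memory(const std::uint32_t& address) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&address);
    return {bytes[0], bytes[1], bytes[2], bytes[3]};
}

// Usage sketch: the bits are unchanged, so copying the octets back into a
// 32-bit integer reproduces the original value.
void demo_octets() {
    const std::uint32_t address = 0xC0A80001u;  // 192.168.0.1 when read as a big-endian integer
    const std::array<unsigned char, 4> octets = octets_in_memory(address);
    std::uint32_t rebuilt = 0;
    std::memcpy(&rebuilt, octets.data(), sizeof rebuilt);
    assert(rebuilt == address);
}
```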

SoronelHaetir