
I want to stick the bits of an int32_t into the type uint32_t without any transformation, just a reinterpretation. The following code does precisely what I want:

int32_t  iA = -1;
uint32_t uA = *(uint32_t*)&iA;

But I was wondering: can I rely on the following easier-to-write cast generating the same (or less) assembly, ideally just movs? (That is, it will never do "math" on the value, leaving the underlying bits untouched.)

int32_t  iB = -1;
uint32_t uB = (uint32_t)iB;

assert(uA == uB); // ?
Evg
Anne Quinn
  • Your comparison itself is changing them. https://stackoverflow.com/questions/8233161/compare-int-and-unsigned-int As to whether the cast itself adds any work to what's already being done, probably not, but I don't know. – Kenny Ostrom Jan 05 '20 at 14:11
  • @KennyOstrom - oh no, the assert is just there to say I want the first cast to be equal to the second cast (it's comparing two uint) – Anne Quinn Jan 05 '20 at 14:13
  • You can check the generated assembly yourself on https://gcc.godbolt.org/ – HolyBlackCat Jan 05 '20 at 14:13
  • @HolyBlackCat - I have, the second cast generates one fewer instructions (it's roughly the same though), but I wonder if that's just a coincidence of how the compiler does things, or if the standard requires it – Anne Quinn Jan 05 '20 at 14:14
  • @AnneQuinn The standard doesn't mention assembly at all, it only describes how programs should behave. I got the same results (one instruction less) on GCC, and Clang generated the same assembly for both. – HolyBlackCat Jan 05 '20 at 14:16
  • If you got a different result then you probably compiled without optimizations, which is a meaningless test. – interjay Jan 05 '20 at 14:21
  • I would use `memcpy`, that's guaranteed to not change any bits. – alain Jan 05 '20 at 14:21
  • `uint32 uA = *(uint32*)&iA;` Type punning like this is almost always undefined behaviour in C++ – though I’m not completely certain in this case. Anyway, do not even think about doing something like that unless you know for certain and have double checked that it is explicitly allowed by the language or your compiler. On the other hand casting fundamental types is always safe, although I’d suggest a `static_cast`. This isn’t C, after all. – besc Jan 05 '20 at 14:23
  • @HolyBlackCat - That's okay, I'm not hung up on the asm itself, I just want to avoid transforming the bits, (for example, if `int32` is instead `float32`, it totally changes the bit pattern before copying, in the second cast, but not the first) – Anne Quinn Jan 05 '20 at 14:23
  • @besc it is [OK to alias signed/unsigned](https://stackoverflow.com/questions/48060240/can-an-int-be-aliased-as-an-unsigned-int) types. – rustyx Jan 05 '20 at 14:25
  • @AnneQuinn It's always a no-op on two's-complement architectures (i.e. on all modern ones). Also C++20 will require two's-complement. – HolyBlackCat Jan 05 '20 at 14:34

2 Answers


Until C++20, the representation of signed integers is implementation-defined. However, the std::intN_t types are guaranteed to use two's-complement representation even before C++20:

int8_t, int16_t, int32_t, int64_t - signed integer type with width of exactly 8, 16, 32 and 64 bits respectively with no padding bits and using 2's complement for negative values (provided only if the implementation directly supports the type)

When you write

std::int32_t  iA = -1;
std::uint32_t uA = *(std::uint32_t*)&iA;

you get the value with all bits set. The standard says that accessing std::int32_t through a pointer of type std::uint32_t* is permitted if "type is similar to ... a type that is the signed or unsigned type corresponding to the dynamic type of the object". Thus, strictly speaking, we have to ensure that std::uint32_t is indeed an unsigned type corresponding to std::int32_t before dereferencing the pointer:

static_assert(std::is_same_v<std::make_unsigned_t<std::int32_t>, std::uint32_t>);

When you write

std::int32_t  iB = -1;
std::uint32_t uB = (std::uint32_t)iB;

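you rely on the conversion to the unsigned type, which is well-defined: the result is the source value reduced modulo 2^32. Because std::int32_t uses two's complement, this leaves the bit pattern unchanged, so it is guaranteed to produce the same value as the pointer cast.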
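To make the conversion concrete, here is a minimal sketch that compares the value conversion against a byte-wise copy. The helper names `to_unsigned_cast`, `to_unsigned_memcpy` and `check_same_bits` are hypothetical and not part of the original post; the sketch assumes the implementation provides std::int32_t.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Value conversion: the result is the input reduced modulo 2^32.
constexpr std::uint32_t to_unsigned_cast(std::int32_t v) {
    return static_cast<std::uint32_t>(v);
}

// Byte-wise copy: the object representation is copied unchanged.
inline std::uint32_t to_unsigned_memcpy(std::int32_t v) {
    std::uint32_t out;
    std::memcpy(&out, &v, sizeof out);
    return out;
}

// Compile-time checks of the modulo rule.
static_assert(to_unsigned_cast(-1) == 0xFFFFFFFFu,
              "-1 converts to 2^32 - 1");
static_assert(to_unsigned_cast(std::numeric_limits<std::int32_t>::min()) == 0x80000000u,
              "-2^31 converts to 2^31");

// Runtime check that both approaches agree for a given value.
inline void check_same_bits(std::int32_t v) {
    assert(to_unsigned_cast(v) == to_unsigned_memcpy(v));
}
```

Calling `check_same_bits` with any std::int32_t value should pass, since the modulo conversion and the two's-complement bit pattern coincide.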

As for the assembly, both casts are no-ops:

std::uint32_t foo() {
    std::int32_t  iA = -1;
    static_assert(std::is_same_v<std::make_unsigned_t<std::int32_t>, std::uint32_t>);
    return *(std::uint32_t*)&iA;
}

std::uint32_t bar() {
    std::int32_t  iB = -1;
    return (std::uint32_t)iB;
}

result in:

foo():
        mov     eax, -1
        ret
bar():
        mov     eax, -1
        ret
Evg

Using memcpy is a common way to avoid undefined behavior when reinterpreting an object as another type. As pointed out in the comments, aliasing types that differ only in their signedness is OK, but that would not be the case for float and int, for example.

memcpy works as long as the object representation is valid for the type.

Compilers are very good at optimizing memcpy calls; in this case the call is completely optimized away.
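The memcpy approach described above could look roughly like the following sketch. The helper name `bit_copy` is hypothetical and not from the original answer, and the float example assumes that float is a 32-bit IEEE-754 type, which the standard does not guarantee.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copies the bytes of `from` into a new object of type To.
template <class To, class From>
To bit_copy(const From& from) {
    static_assert(sizeof(To) == sizeof(From),
                  "source and destination must have the same size");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "memcpy requires trivially copyable types");
    To to;
    std::memcpy(&to, &from, sizeof to);  // no value conversion, bytes only
    return to;
}

// Signed to unsigned: same bits as the cast in the question.
inline std::uint32_t bits_of(std::int32_t v) { return bit_copy<std::uint32_t>(v); }

// Float to unsigned: here a cast would convert the value instead
// (assumption: float is 32-bit IEEE-754).
inline std::uint32_t bits_of(float v) { return bit_copy<std::uint32_t>(v); }
```

For example, under the IEEE-754 assumption `bits_of(1.0f)` yields 0x3F800000, whereas `static_cast<std::uint32_t>(1.0f)` yields 1, which is the difference between copying bits and converting a value.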

alain
  • if I could accept multiple answers... I ended up using `memcpy`, static cast for type punning was a bad idea anyway, but I worded the question way too narrowly – Anne Quinn Jan 05 '20 at 15:47