1

I know about the question "memcpy/memmove to a union member, does this set the 'active' member?", but I think my question is different. So:

Suppose sizeof( int ) == sizeof( float ) and I have the following code snippet:

union U{
    int i;
    float f;
};

U u;
u.i = 1; //i is the active member of u

::std::memcpy( &u.f, &u.i, sizeof( u ) ); //copy memory content of u.i to u.f

My questions:

  1. Does the code lead to undefined behaviour (UB)? If yes, why?
  2. If the code does not lead to UB, what is the active member of u after the memcpy call, and why?
  3. What would the answers to the previous two questions be if sizeof( int ) != sizeof( float ), and why?
timrau
cerveka2

4 Answers

3

You are not allowed to use memcpy to copy overlapping regions of memory:

If the objects overlap, the behavior is undefined.

Your code has undefined behavior because of this violation of memcpy's precondition, as u.f and u.i occupy the same address in memory.
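One way to sidestep the overlap is to route the bytes through a separate temporary, so that the source and destination of std::memcpy are distinct objects. The following is only a minimal sketch: the helper name copy_i_to_f_via_temp is hypothetical, it assumes sizeof(int) == sizeof(float) as in the question, and it does not settle whether u.f becomes the active member (that is the subject of the linked question).

```cpp
#include <cstring>

union U {
    int i;
    float f;
};

// Assumption taken from the question: both members have the same size.
static_assert(sizeof(int) == sizeof(float), "sketch assumes equal sizes");

// Hypothetical helper: copies the bytes of u.i into the storage of u.f.
// The source is a local temporary, so it cannot overlap with u.
inline void copy_i_to_f_via_temp(U& u) {
    int tmp = u.i;                        // read the currently active member
    std::memcpy(&u.f, &tmp, sizeof tmp);  // tmp and u are distinct objects
}
```

After the call, the storage of u holds the same bytes as before, which can be checked with std::memcmp against the original int value without reading u.f.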

ComicSansMS
3

Regardless of the union, the behaviour of std::memcpy is undefined if the source and destination overlap. All members of a union start at the same address, so any two of them overlap, and it would be no different if the sizes weren't the same.

If you were to use std::memmove instead, the overlap would no longer be an issue, and it also wouldn't matter that you copy from a member of a union. Since both types are trivially copyable, the behaviour is defined and u.f becomes the active member of the union, although in practice the union holds the same bytes as before.

The only issue would arise if sizeof(U) were larger than sizeof(int), because you would then be copying potentially uninitialized bytes, which is undefined behaviour.
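The std::memmove variant described in this answer could be sketched as below. The helper name move_i_to_f and the static_assert guards are assumptions added for illustration; the guards reject platforms where the union is larger than int, which is the problematic case mentioned above.

```cpp
#include <cstring>

union U {
    int i;
    float f;
};

// Assumed guards: the copy must not touch bytes outside u.i.
static_assert(sizeof(int) == sizeof(float), "sketch assumes equal sizes");
static_assert(sizeof(U) == sizeof(int), "union must not be larger than int");

// Hypothetical helper: std::memmove tolerates overlapping (here: identical)
// source and destination ranges, unlike std::memcpy.
inline void move_i_to_f(U& u) {
    std::memmove(&u.f, &u.i, sizeof(float));
}
```

Copying sizeof(float) bytes rather than sizeof(u) keeps the copy within the member even if the guards were relaxed.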

IS4
  • There is no reason that a compiler shouldn't be able to act as though the destination will simultaneously contain objects of every type for which the copied bit pattern would be valid, at least until the next time it is modified using any member of the union, but that would require fixing the broken object abstraction model encapsulated in the C++ Standard and the clang/gcc optimizers. – supercat Sep 02 '22 at 22:30
1

You can use std::bit_cast (C++20) to switch the active member of a union. Additionally, it's constexpr, so you can even use the union in a constant expression.

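#include <bit>

union U {
    int i;
    float f;
    // Read the active member, then assign the other one to make it active.
    constexpr void switch_to_int() { this->i = std::bit_cast<int>(f); }
    constexpr void switch_to_float() { this->f = std::bit_cast<float>(i); }
};

constexpr int foo() {
    U u{};              // i is the active member
    u.f = 2.0f;         // f becomes the active member
    u.switch_to_int();  // i becomes the active member again
    return u.i;
}

int main()
{
    constexpr int i = foo();  // evaluated at compile time
}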

Compiler Explorer

std::bit_cast behaves like a memcpy of the object representation, and compilers do a great job of optimizing the code. Here it produces a temporary value that is then written to the new active union member. This is what's left of the call to foo() at runtime:

mov eax,40000000h

doug
0
  1. Yes, it's undefined behaviour (UB), because &u.f and &u.i point to the same start address. See the declaration of memcpy:
void *   memcpy (void *__restrict, const void *__restrict, size_t);

The C99 keyword restrict is an indication to the compiler that different object pointer types and function parameter arrays do not point to overlapping regions of memory.

This enables the compiler to perform optimizations that might otherwise be prevented because of possible aliasing.

It is your responsibility to ensure that restrict-qualified pointers do not point to overlapping regions of memory.

__restrict, permitted in C90 and C++, is a synonym for restrict.

  2. Because it is UB, the result is undefined.

  3. Also UB, because &u.f and &u.i always point to the same start address, regardless of whether their sizes are equal. sizeof(U) is at least the size of the largest member of the union.
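The overlap between the two members can be made visible with a run-time check. This is only a sketch: ranges_overlap is a hypothetical helper, and it assumes that std::uintptr_t is available and that converting pointers to it preserves address order, which holds on common flat-memory platforms but is implementation-defined.

```cpp
#include <cstddef>
#include <cstdint>

union U {
    int i;
    float f;
};

// Hypothetical helper: true if the n-byte ranges starting at a and b
// share at least one byte. Assumes a flat address space.
inline bool ranges_overlap(const void* a, const void* b, std::size_t n) {
    const auto pa = reinterpret_cast<std::uintptr_t>(a);
    const auto pb = reinterpret_cast<std::uintptr_t>(b);
    return n != 0 && pa < pb + n && pb < pa + n;
}
```

For a union object u, ranges_overlap(&u.i, &u.f, sizeof u) is true, whereas two adjacent array elements checked with n equal to the element size do not overlap.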

Zongru Zhan