gcc: avoiding strict-aliasing violation warning by explicit memcpy

Question

I have a class that takes 64 bit in memory. To implement equality, I used reinterpret_cast<uint64_t*>, but it results in this warning on gcc 7.2 (but not clang 5.0):

$ g++ -O3 -Wall -std=c++17 -g -c example.cpp 
example.cpp: In member function ‘bool X::eq_via_cast(X)’:
example.cpp:27:85: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     return *reinterpret_cast<uint64_t*>(this) == *reinterpret_cast<uint64_t*>(&x);                                                                                     ^

From my understanding, casting is undefined behavior unless you are casting to the actual type or to char*. For instance, there could be architecture specific layout restricts while loading values. That is why I tried alternative approaches.

Here is the source code of a simplified version (link to godbolt):

#include <cstdint>
#include <cstring>

struct Y
{
    uint32_t x;
    bool operator==(Y y) { return x == y.x; }
};

struct X
{
    Y a;
    int16_t b;
    int16_t c;

    uint64_t to_uint64() {
        uint64_t result;
        std::memcpy(&result, this, sizeof(uint64_t));
        return result;
    }

    bool eq_via_memcpy(X x) {
        return to_uint64() == x.to_uint64();
    }

    bool eq_via_cast(X x) {
        return *reinterpret_cast<uint64_t*>(this) == *reinterpret_cast<uint64_t*>(&x);
    }

    bool eq_via_comparisons(X x) {
        return a == x.a && b == x.b && c == x.c;
    }
};
static_assert(sizeof(X) == sizeof(uint64_t));

bool via_memcpy(X x1, X x2) {
    return x1.eq_via_memcpy(x2);
}

bool via_cast(X x1, X x2) {
    return x1.eq_via_cast(x2);
}

bool via_comparisons(X x1, X x2) {
    return x1.eq_via_comparisons(x2);
}

Avoiding the cast by explicitly copying the data via memcpy prevents the warning. As far as I understand it, it should also be portable.

Looking at the assembler (gcc 7.2 with -std=c++17 -O3), memcpy is optimized perfectly while the straightforward comparisons lead to less efficient code:

via_memcpy(X, X):
  cmp rdi, rsi
  sete al
  ret

via_cast(X, X):
  cmp rdi, rsi
  sete al
  ret

via_comparisons(X, X):
  xor eax, eax
  cmp esi, edi
  je .L7
  rep ret
.L7:
  sar rdi, 32
  sar rsi, 32
  cmp edi, esi
  sete al
  ret

Very similar with clang 5.0 (-std=c++17 -O3):

via_memcpy(X, X): # @via_memcpy(X, X)
  cmp rdi, rsi
  sete al
  ret

via_cast(X, X): # @via_cast(X, X)
  cmp rdi, rsi
  sete al
  ret

via_comparisons(X, X): # @via_comparisons(X, X)
  cmp edi, esi
  jne .LBB2_1
  mov rax, rdi
  shr rax, 32
  mov rcx, rsi
  shr rcx, 32
  shl eax, 16
  shl ecx, 16
  cmp ecx, eax
  jne .LBB2_3
  shr rdi, 48
  shr rsi, 48
  shl edi, 16
  shl esi, 16
  cmp esi, edi
  sete al
  ret
.LBB2_1:
  xor eax, eax
  ret
.LBB2_3:
  xor eax, eax
  ret

From this experiment, it looks like the memcpy version is the best approach in performance critical parts of the code.

Questions:

Is my understanding correct that the memcpy version is portable C++ code?
Is it reasonable to assume that the compilers are able to optimize away the memcpy call like in this example?
Are there better approaches that I have overlooked?

Update:

As UKMonkey pointed out, memcmp is more natural when doing bitwise comparisons. It also compiles down to the same optimized version:

bool eq_via_memcmp(X x) {
    return std::memcmp(this, &x, sizeof(*this)) == 0;
}

Here is the updated godbolt link. Should also be portable (sizeof(*this) is 64 bit), so I assume it is the best solution so far.

You're relying on a particular struct layout in memory, so that presumably will limit portability. — Oliver Charlesworth, Jan 02 '18 at 12:21
So what you're really trying to do is a bitwise compare of the memory of the classes; that happens to be 64 bit? Why not use `memcmp(this, other, sizeof(X))`? — UKMonkey, Jan 02 '18 at 12:21
@UKMonkey That is a really good point. I have completely overlooked memcmp. Yes, I think that is an improvement over memcpy. — Philipp Claßen, Jan 02 '18 at 12:42
If you want to do this, I suggest at least asserting `has_unique_object_representations_v`. — T.C., Jan 02 '18 at 13:10
@T.C. Was not aware of has_unique_object_representations_v, but it seems to guarantee the object has no padding bytes (https://stackoverflow.com/q/42855326/783510). Having some static_assert in place makes sense, otherwise memcpy will break if the class is extended. — Philipp Claßen, Jan 02 '18 at 13:58
Note that in general, using memcmp for structs is unsafe - it doesn't into account the indeterminate value of padding bits/bytes. — Oliver Charlesworth, Jan 02 '18 at 23:59
[memcpy should be what you want](https://stackoverflow.com/a/51228315/1708801) we may get [bit_cast](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0476r2.html) in C++20. — Shafik Yaghmour, Jul 09 '18 at 06:18
_"From my understanding, casting is undefined behavior unless you are casting to the actual type or ..."_ - this is a common misunderstanding, but no; the cast itself is usually fine, it's de-referencing the resulting pointer that is the problem (as the warning from gcc says). — davmac, Jul 16 '18 at 13:51

Philipp Claßen · Accepted Answer · 2018-07-16T12:56:46.143

In C++17, memcmp in combination with has_unique_object_representations can be used:

bool eq_via_memcmp(X x) {
    static_assert(std::has_unique_object_representations_v<X>);
    return std::memcmp(this, &x, sizeof(*this)) == 0;
}

Compilers should be able to optimize it to one comparison (godbolt link):

via_memcmp(X, X):
  cmp rdi, rsi
  sete al
  ret

The static assertion makes sure that the class X does not contain padding bits. Otherwise, comparing two logically equivalent objects could return false because the content of the padding bits may differ. In that case, it is safer to reject that code at compile time.

(Note: Presumably, C++20 will add std::bit_cast, which could be used as an alternative for memcmp. But still, you have to make sure that no padding is involved for the same reason.)

gcc: avoiding strict-aliasing violation warning by explicit memcpy

1 Answers1