Why does invoking a function with a pointer to one member variable inhibit optimization of other member variable?

Question

Consider the following program:

struct X {
    bool b;
    int y;
};

void f(int*);

bool test(X x) {
    if (!x.b) {
        return false;
    }
    f(&x.y);
    return x.b;
}

It seems that the statement

    return x.b;

does not get optimized to return true; So my conclusion is, that the compiler must assume that f(int*) could modify x.b by doing this:

void f(int* p) {
    bool* q = reinterpret_cast<bool*>(p - 1);
    *q = false;
}

Is this implementation of f actually valid C++? Or differently phrased, must the compiler always assume that the entire object may change when a pointer to a member is passed to an opaque function?

Compiler output with clang-x86 trunk (-O1) is this:

test(X):                              # @test(X)
        push    rax
        mov     qword ptr [rsp], rdi
        test    dil, dil
        je      .LBB0_1
        lea     rdi, [rsp + 4]
        call    f(int*)@PLT
        cmp     byte ptr [rsp], 0
        setne   al
        pop     rcx
        ret
.LBB0_1:
        xor     eax, eax
        pop     rcx
        ret

See live demo here

One interpretation I can come up with, is that the given definition of f is valid C++ if I only ever call it like in the example above and so the compiler must assume that possibility.

No, such implementation of `f` is straight up UB. Now whether `return true;` is faster than `return x.b;` might be up to debate... — Yksisarvinen, Jul 19 '23 at 10:20
@Yksisarvinen So what you are saying is, that a conforming compiler could assume that the call `f(&x.y)` (whatever the definition of `f`) cannot change `x.b`? (Considering that `x` is not aliased since it's a local variable) — chrysante, Jul 19 '23 at 10:24
Conforming compiler should assume that aliasing any object field leads to aliasing of the entire object. And it is actually a relatively common practice, for example it is used in boost and in php. `X * p_object{reinterpret_cast(reinterpret_cast(p) - offsetof(X, y))}; ...` — user7860670, Jul 19 '23 at 10:26
@user7860670 Thank you, that's what I wanted to know. Could you perhaps point me to the relevant section in the standard? — chrysante, Jul 19 '23 at 10:27
btw. the impl of f - really is UB. subtracting an int pointer to get the bool field... This usually works because of 3 padding bytes, but... It would not work if the platform says the alignment requirement for int is 1 - than you would access data before the struct. — Bernd, Jul 19 '23 at 10:31
@Bernd Yes, I guess I should at least be using the `offsetof` macro, see user7860670's comment for reference — chrysante, Jul 19 '23 at 10:35
I'm almost a bit surprised (disappointed?) that the compiler didn't optimise to `if(x.b) { f(&x.y); } return x.b;`, which is what I'd have written right in the first place anyway... — Aconcagua, Jul 19 '23 at 10:51
@Aconcagua I guess if `x.b` is very unlikely, then this would be a pessimization because the memory would be read twice. — chrysante, Jul 19 '23 at 10:53
@chrysante Well... looks as if GCC [does so](https://godbolt.org/z/xdPTxK1En) with -O3 ;) — Aconcagua, Jul 19 '23 at 11:00

score 8 · Answer 1 · answered Jul 19 '23 at 10:27

The shown implementation of f is certainly undefined behavior. However, the compiler not only owes conformity to the C++ language, but also to platform specific behaviors that do permit that x.b is changed.

The compiler does not see how f is implemented. What if it was written in assembly? Such an implementation certainly may do what it wants, including changing x.b without violating any specifications and standards. Hence, it must assume the worst and re-read x.b.

user7860670 · Accepted Answer · 2023-07-19T10:40:50.100

5

While the shown implementation of f leads to undefined behavior, it can be implemented in correct manner:

#include <type_traits>
#include <cstddef>

void f(int* p)
{
   static_assert(::std::is_standard_layout_v<X>);
   X * p_object{reinterpret_cast<X *>(reinterpret_cast<unsigned char *>(p) - offsetof(X, y))}; 
   p_object->b = false;
}

So compiler may assume that aliasing any field of object of standard layout type leads to aliasing of entire object.

edited Jul 19 '23 at 10:40

answered Jul 19 '23 at 10:35

user7860670

35,849
4
58
84

1

I assume this question is related: https://stackoverflow.com/questions/18385020/can-code-that-will-never-be-executed-invoke-undefined-behavior Behaviour is well defined as long as `p` always points to an `X::y` member when `f` is invoked. – chrysante Jul 19 '23 at 10:40
2

Also in practice any real-life compiler should probably assume that aliasing any field of object even of non-standard layout type leads to aliasing of entire object. For example boost heavily relies on stuff such as [parent_from_member](https://www.boost.org/doc/libs/1_82_0/boost/intrusive/detail/parent_from_member.hpp). I think that uncontrolled aliasing is actually one of the major limiting factor of the performance of C++ programs (compared to Fortran for example, where, as far as i know, compiler is free to assume no aliasing). – user7860670 Jul 19 '23 at 10:53

Quimby · Answer 3 · 2023-07-19T10:46:25.773

You rely on padding in your implementation so that is definitely UB.

I believe the following implementation is not undefined behaviour if called with f(&x.y).

void f(int* ptr){
    X* x = reinterpret_cast<X*>(reinterpret_cast<char*>(ptr)-offsetof(struct X,y));
    x->b=true;
}

Anyway, Linux kernel itself heavily relies on the struct being reachable from a pointer to its member via container_of macro so regardless what behaviour is defined or not, mainstream compilers do assume reachability and will not optimize this case.

Why does invoking a function with a pointer to one member variable inhibit optimization of other member variable?

3 Answers3