Consider the following C code:

#include <stdbool.h>

typedef struct {
    bool a;
    bool b;
} MyStruct2;

bool g(MyStruct2 *s) {
    return s->a || s->b;
}

bool g2(MyStruct2 *s) {
    return s->a | s->b;
}

int main(void) {
    return 0;
}

GCC compiles this into the following x86-64 assembly:

g:
        movzx   eax, BYTE PTR [rdi]
        test    al, al
        jne     .L1
        movzx   eax, BYTE PTR [rdi+1]
.L1:
        ret


g2:
        movzx   eax, BYTE PTR [rdi]
        or      al, BYTE PTR [rdi+1]
        ret


main:
        xor     eax, eax
        ret

g2 is shorter and contains no jump. So why does GCC not optimize g to the same code as g2? Neither member of MyStruct2 is volatile (or otherwise special), so it should be safe to evaluate s->b in g unconditionally, even when s->a is true and the value of s->b is not needed.
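To make the claim concrete, here is a minimal sketch that checks g and g2 agree on every possible input, which is what would make the transformation legal at the source level. The helper `count_mismatches` is a hypothetical name added for illustration; it is not part of the original code, and the check says nothing about which form is faster.

```c
#include <stdbool.h>

typedef struct {
    bool a;
    bool b;
} MyStruct2;

/* Short-circuit form from the question. */
bool g(MyStruct2 *s) {
    return s->a || s->b;
}

/* Bitwise form from the question: both members are always read. */
bool g2(MyStruct2 *s) {
    return s->a | s->b;
}

/* Hypothetical helper: counts the inputs on which g and g2 disagree. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int a = 0; a <= 1; ++a) {
        for (int b = 0; b <= 1; ++b) {
            MyStruct2 s = { .a = (bool)a, .b = (bool)b };
            if (g(&s) != g2(&s)) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Since a bool holds only true or false, the four combinations cover the whole input space, and `count_mismatches()` is expected to return 0.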

Why doesn't GCC produce the shorter, branch-free code?

Thanks

phuclv
Kevin Meier
  • AFAIK, it is a missed optimization, as reading `s->b` eagerly instead of lazily should not cause any side effect (it is not a volatile variable and the pointer indirection is already done anyway). Compilers should be able to do this. In practice, neither GCC, nor Clang, nor ICC does this, which is disappointing (especially since this kind of missed optimization often prevents the vectorization of boolean-based code)... – Jérôme Richard Sep 02 '22 at 16:38
  • If they weren't right next to each other, branching could avoid a possible cache miss if the load in the RHS was cold in cache. IDK if that's why GCC is avoiding it. They're members of the same struct object, and evaluating `s->a` guarantees that a valid `MyStruct2` object exists there, so it's not a correctness problem to unconditionally do the 2nd deref. (It would be if it was an unrelated pointer that might be NULL). Or a duplicate of [Boolean values as 8 bit in compilers. Are operations on them inefficient?](https://stackoverflow.com/q/47243955) - missed opts even without pointers. – Peter Cordes Sep 03 '22 at 13:32
  • Oh, maybe not a duplicate; https://godbolt.org/z/hascsoqbv shows that with two `bool` args by value, no pointers, current GCC does optimize `a||b` the same as `a|b`. – Peter Cordes Sep 03 '22 at 13:36
  • [Why does Clang generate different code for reference and non-null pointer arguments?](https://stackoverflow.com/q/66298438) is the same situation with a struct of two `int`s, with compilers failing to optimize. I think that works as a duplicate (@JérômeRichard) – Peter Cordes Sep 03 '22 at 14:00
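The three situations contrasted in the comments above can be sketched side by side. This is only an illustration under the comments' own assumptions; the function names `either_member`, `either_pointee` and `either_value` are hypothetical and do not appear in the question.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool a;
    bool b;
} MyStruct2;

/* Same object: per the comments, once s->a has been read, a valid
   MyStruct2 exists at s, so reading s->b as well cannot fault. */
bool either_member(const MyStruct2 *s) {
    return s->a || s->b;
}

/* Unrelated pointers: q may legitimately be NULL whenever *p is true,
   so the short-circuit is needed for correctness here. */
bool either_pointee(const bool *p, const bool *q) {
    return *p || *q;
}

/* By value: no memory access at all, only two values to combine. */
bool either_value(bool a, bool b) {
    return a || b;
}
```

A call such as `either_pointee(&t, NULL)` with `t` set to true is well defined only because `||` never evaluates `*q` in that case, which is why a compiler cannot load both operands unconditionally for unrelated pointers.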

0 Answers