1

I'm working on exercise 3.61 of CSAPP, which requires to write a very simple function that checks if a pointer is NULL before trying to dereference it, which should base on a conditional move instruction rather than a jump. Here's an example I've found online:

long cond(long* p) {
    return (!p) ? 0 : *p;
}

According to claims, the function can be compiled into the following assembly:

cond:
    xor eax, eax
    test rdi, rdi
    cmovne rax, QWORD PTR [rdi]
    ret

I am running GCC 7.3.0 (from APT package gcc/bionic-updates,now 4:7.3.0-3ubuntu2.1 amd64) on Ubuntu 18.04 on WSL. The computer is running on an Intel Coffee Lake (i.e. 8th-gen Core-i) processor.

I have tried the following commands:

gcc -S a.c -O3
gcc -S a.c -O3 -march=x86-64
gcc -S a.c -O3 -march=core2
gcc -S a.c -O3 -march=k8

Honestly, I wasn't able to observe any difference in the generated a.s file, since all of them look like

cond:
    xorl    %eax, %eax
    testq   %rdi, %rdi
    je      .L1
    movq    (%rdi), %rax
.L1:
    ret

Is there any possibility to have such a function that compiles into a conditional move, without a jump?


Edit: As told in comments, the CMOVxx series of instructions loads operands unconditionally, and only the actual assignment operation is conditional, so there'd be no luck to put *p (or (%rdi)) as the source operand of CMOV, is it right?

The claim is on this page but I think it's invalid.

iBug
  • 35,554
  • 7
  • 89
  • 134
  • 6
    The optimisation is invalid as cmov performs an unconditional load from its source operand, faulting even if it does not end up being used. – fuz Feb 06 '19 at 16:32
  • Edited code does not solve it. The source location is accessed even if the condition is false. – Eugene Sh. Feb 06 '19 at 16:36
  • thanks guys I think I have a better understanding now. But does anyone know if there's any chance of getting a CMOV instruction? – iBug Feb 06 '19 at 16:38
  • 1
    @iBug Like *any* `cmov`? For just some C code? Here: https://godbolt.org/z/UkonTX – Eugene Sh. Feb 06 '19 at 16:41
  • @EugeneSh. I assume its functionality should be retained, i.e. dereference a pointer, return 0 if NULL> – iBug Feb 06 '19 at 16:43
  • 1
    @iBug Your updated code is not producing `cmove` for me here: https://godbolt.org/z/C_cGjB , but it does with `-O1`. Also note, that you don't need `z` to be `static`. – Eugene Sh. Feb 06 '19 at 17:00
  • You could first conditionally overwrite the pointer with a dummy pointer if it is NULL and then conditionally move the pointee to the result. E.g. `xor %eax, %eax; test %rdi, %rdi; cmovz %rsp, %rdi; cmovnz (%rdi), %rax`. This could perform slightly better than the jump but I think it only does if the jump is poorly predicted. – fuz Feb 06 '19 at 17:23
  • 1
    Beyond the correctness issue pointed out by fuz and others, even if you could use a conditional-move, it's very unlikely to be an optimization over a simple branch. Why? Branch prediction. Branch prediction on modern processors works in your favor as long as the branches are predictable, and null pointer checks are quite predictable. You'll find that Intel's compiler prefers to just test-and-branch in such cases, even when it could use a conditional move. – Cody Gray - on strike Feb 06 '19 at 18:39

1 Answers1

0

Here is a branch-less version:

inline long* select(long* p, long* q) {
    uintptr_t a = (uintptr_t)p;
    uintptr_t b = (uintptr_t)q;
    uintptr_t c = !a;
    uintptr_t r = (a & (c - 1)) | (b & (!c - 1));
    return (long*)r;
}

long cond(long* p) {
    long t = 0;
    return *select(p, &t);
}

Assembly for gcc-8.2:

cond(long*):
        mov     QWORD PTR [rsp-8], 0
        xor     eax, eax
        test    rdi, rdi
        sete    al
        lea     rdx, [rax-1]
        neg     rax
        and     rdi, rdx
        lea     rdx, [rsp-8]
        and     rax, rdx
        or      rdi, rax
        mov     rax, QWORD PTR [rdi]
        ret

Assembly for clang-7:

cond(long*):                              # @cond(long*)
        mov     qword ptr [rsp - 8], 0
        xor     eax, eax
        test    rdi, rdi
        lea     rcx, [rsp - 8]
        cmovne  rcx, rax
        or      rcx, rdi
        mov     rax, qword ptr [rcx]
        ret
Maxim Egorushkin
  • 131,725
  • 17
  • 180
  • 271