I wrote this minimal C++ program that takes three numbers, `a`, `b` and `c`, and a bitmask `r`. It computes a result `L`, which should be equal to `c` if the second bit of `r` is set, otherwise equal to `b` if the first bit of `r` is set, and finally equal to `a` if neither of the first two bits of `r` is set. I want to use assembly to optimize it and GCC (g++) to compile it. This is my code:
#include <cstdio>
#include <cstdlib>

int main() {
    unsigned a = 1;
    unsigned b = 2;
    unsigned c = 3;
    unsigned r = 1;
    unsigned L;
    asm(
        "mov %2,%0;"
        "bt $0,%1;"
        "cmovc %3,%0;"
        "bt $1,%1;"
        "cmovc %4,%0;"
        : "=r" (L)
        : "r" (r), "r" (a), "r" (b), "r" (c)
    );
    printf("%u\n", L);
    return 0;
}
In the setup above, `L` should be equal to `b`. However, no matter which parameters I compile it with, the printed value is always 3, i.e. `c`. Why is that, and how do I write this program correctly?
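For reference, the selection described above can be sketched in plain C++ without any assembly. This is only an illustration of the intended behavior: the function name `select_ref` is hypothetical and does not appear in the original program.

```cpp
// Hypothetical reference helper (not part of the original program):
// picks c if bit 1 of r is set, otherwise b if bit 0 is set, otherwise a.
unsigned select_ref(unsigned a, unsigned b, unsigned c, unsigned r) {
    unsigned L = a;        // default: neither of the first two bits is set
    if (r & 1u) L = b;     // first bit (bit 0) set: take b
    if (r & 2u) L = c;     // second bit (bit 1) set: c takes priority over b
    return L;
}
```

With the values from the question, `select_ref(1, 2, 3, 1)` returns 2, which is the result the assembly version is expected to produce.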
EDIT: This question is already answered here, but I still want to post an answer to it because it may help others. Since I am not allowed to post it as an actual answer, I will write it here:
It turns out that the code works fine unless I use the `-O3` flag. With `-O3`, the compiler decides to mess things up like this:
In this minimal example, it stores `a` and `r` in the same register, and then it puts `L` in the register of `a` or `b`, I am not sure which. Either way, the snippet overwrites registers that it shouldn't.
In my actual code, where I wanted to use this assembly, `L` is a reference passed as an argument to a function. There, the compiler decided to store one of `a`, `b` or `c` in `L` as an optimization, completely ignoring that `L` already has a value.
This happens because my assembly snippet does not tell the compiler to keep the value of `L` in its place: I declared the operand as `"=r"` (write-only) instead of `"+r"` (read-write).
Also, `r` should be moved to the output operands, again with `"+r"`: even though `bt` won't change it, it is still treated as an output operand.
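A minimal sketch of the fix described above, assuming GCC-style extended asm on x86 or x86-64. The function names `select_rw`, `select_early` and `select_plain` are hypothetical. The first variant uses `"+r"` for both `L` and `r`. The second is an alternative that is not from the post: GCC documents the early-clobber modifier `&` (as in `"=&r"`) for an output that is written before all inputs have been read, which is the case here because `mov` writes `%0` first. A plain C++ fallback is included only so the sketch still compiles on other targets.

```cpp
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAS_X86_GNU_ASM 1
#else
#define HAS_X86_GNU_ASM 0
#endif

// Portable fallback (assumption: only used when x86 GNU-style asm is unavailable).
inline unsigned select_plain(unsigned a, unsigned b, unsigned c, unsigned r) {
    unsigned L = a;
    if (r & 1u) L = b;
    if (r & 2u) L = c;
    return L;
}

// Variant 1: L and r are declared read-write ("+r"), as described in the post.
unsigned select_rw(unsigned a, unsigned b, unsigned c, unsigned r) {
    unsigned L = 0;  // initialized because "+r" means the asm may also read L
#if HAS_X86_GNU_ASM
    asm("mov %2,%0\n\t"    // L = a
        "bt $0,%1\n\t"     // carry = bit 0 of r
        "cmovc %3,%0\n\t"  // if carry: L = b
        "bt $1,%1\n\t"     // carry = bit 1 of r
        "cmovc %4,%0"      // if carry: L = c
        : "+r"(L), "+r"(r)
        : "r"(a), "r"(b), "r"(c)
        : "cc");           // bt changes the flags
#else
    L = select_plain(a, b, c, r);
#endif
    return L;
}

// Variant 2: L is an early-clobber output ("=&r"), so the compiler
// will not place it in a register that also holds an input.
unsigned select_early(unsigned a, unsigned b, unsigned c, unsigned r) {
    unsigned L;
#if HAS_X86_GNU_ASM
    asm("mov %2,%0\n\t"
        "bt $0,%1\n\t"
        "cmovc %3,%0\n\t"
        "bt $1,%1\n\t"
        "cmovc %4,%0"
        : "=&r"(L)
        : "r"(r), "r"(a), "r"(b), "r"(c)
        : "cc");
#else
    L = select_plain(a, b, c, r);
#endif
    return L;
}
```

Both variants keep the original instruction sequence and change only the operand constraints, so they can be checked against the values from the question (`a=1`, `b=2`, `c=3`, `r=1`) with and without `-O3`.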