I am currently learning GCC's extended inline assembly. I wrote an A + B function and want to detect the ZF flag, but it behaves strangely.

The compiler I use is gcc 7.3.1 on x86-64 Arch Linux.

I started from the following code, which correctly prints a + b.

int a, b, sum;
scanf("%d%d", &a, &b);
asm volatile (
  "movl %1, %0\n"
  "addl %2, %0\n"
  : "=r"(sum)
  : "r"(a), "r"(b)
  : "cc"
);
printf("%d\n", sum);

Then I simply added a variable to check the flag, and now it gives me the wrong output.

int a, b, sum, zero;
scanf("%d%d", &a, &b);
asm volatile (
  "movl %2, %0\n"
  "addl %3, %0\n"
  : "=r"(sum), "=@ccz"(zero)
  : "r"(a), "r"(b)
  : "cc"
);
printf("%d %d\n", sum, zero);

The GAS assembly output is

  movl  -24(%rbp), %eax  # %eax = a
  movl  -20(%rbp), %edx  # %edx = b
#APP
# 6 "main.c" 1
  movl %eax, %edx
  addl %edx, %edx

# 0 "" 2
#NO_APP
  sete  %al
  movzbl  %al, %eax
  movl  %edx, -16(%rbp)  # sum = %edx
  movl  %eax, -12(%rbp)  # zero = %eax

This time, the sum becomes a + a. But when I just swap %2 and %3, the output is the correct a + b.

Then I checked various gcc versions on wandbox.org (clang does not seem to support a flag as an output): versions 4.5.4 through 4.7.4 give the correct result a + b, and from version 4.8.1 onward the output is always a + a.

My question is: did I write the code wrong, or is something wrong with gcc?

Michael Petch
xris
  • The problem is that you clobber %0 before all the inputs are consumed. The optimizer is allowed to use the same register for an input operand as for an output operand. To avoid this, you need to make output operand %0 an [early clobber](https://gcc.gnu.org/onlinedocs/gcc/Modifiers.html). To do that, change `"=r"(sum)` to `"=&r"(sum)`. – Michael Petch Mar 29 '18 at 13:11
  • @MichaelPetch Thank you for your reply. If I do not use an early clobber, does this mean I have to save the result in another register before using %0, or simply constrain the result to a specific register? – xris Mar 29 '18 at 13:42
  • If you don't want to use an early clobber modifier, then you would have to specify a specific register for each input constraint and a specific register for the output constraint (all different registers). You wouldn't be able to use `"=r"` and `"r"` to let the compiler choose free registers automatically, so the generated code would be less efficient. – Michael Petch Mar 29 '18 at 13:47
  • @MichaelPetch Thank you very much again. My question was solved. But why don't you just post that as an answer rather than a comment? – xris Mar 29 '18 at 14:07
  • Basically a duplicate of [When to use earlyclobber constraint in extended GCC inline assembly?](https://stackoverflow.com/q/15819794) – Peter Cordes Jan 27 '21 at 18:41
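
For reference, the fixed-register alternative described in the comments above might be sketched as follows. The function name `add_with_zf_fixed_regs` and the particular registers (`a`, `c`, `d` constraints) are assumptions chosen for illustration, not something from the question. It needs GCC 6 or later on x86 or x86-64 for the `=@ccz` flag output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (name and register choice are illustrative):
   returns a + b and stores 1 in *zero_out if the addition set ZF, else 0. */
static int add_with_zf_fixed_regs(int a, int b, int *zero_out)
{
    int sum, zero;

    assert(zero_out != NULL);

    __asm__ volatile (
        "movl %2, %0\n\t"
        "addl %3, %0"
        : "=a"(sum),       /* sum is pinned to %eax */
          "=@ccz"(zero)    /* zero receives ZF as 0 or 1 */
        : "c"(a),          /* a is pinned to %ecx */
          "d"(b)           /* b is pinned to %edx, so it cannot alias sum */
        : "cc"
    );

    *zero_out = zero;
    return sum;
}
```

Because every operand names a different register, the overlap cannot happen, but the compiler loses the freedom to pick whichever registers are free at the call site.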

1 Answer

The problem is that you clobber %0 before all the inputs are consumed (%3 in the failing snippet; the first snippet has the same latent problem with %2):

"movl %2, %0\n"
"addl %3, %0\n"

%0 is modified by the MOV before %3 has been read. An optimizing compiler is allowed to assign an input operand the same register as an output operand. In your assembly output, GCC chose %edx for both %3 and %0, so the ADD doubled the value that had just been copied and produced a + a.

To get around this problem, mark the output constraint with &. The & is a modifier denoting an early clobber; the GCC documentation describes it as follows:

‘&’ Means (in a particular alternative) that this operand is an earlyclobber operand, which is written before the instruction is finished using the input operands. Therefore, this operand may not lie in a register that is read by the instruction or as part of any memory address.

‘&’ applies only to the alternative in which it is written. In constraints with multiple alternatives, sometimes one alternative requires ‘&’ while others do not. See, for example, the ‘movdf’ insn of the 68000.

A operand which is read by the instruction can be tied to an earlyclobber operand if its only use as an input occurs before the early result is written. Adding alternatives of this form often allows GCC to produce better code when only some of the read operands can be affected by the earlyclobber. See, for example, the ‘mulsi3’ insn of the ARM.

Furthermore, if the earlyclobber operand is also a read/write operand, then that operand is written only after it’s used.

‘&’ does not obviate the need to write ‘=’ or ‘+’. As earlyclobber operands are always written, a read-only earlyclobber operand is ill-formed and will be rejected by the compiler.

The change to your code would be to modify "=r"(sum) to be "=&r"(sum). This prevents the compiler from assigning the output operand's register to any of the input operands.
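
As a minimal sketch of that change, the failing snippet could be wrapped in a function like the one below. The name `add_with_zf` and the `zero_out` parameter are assumptions introduced for illustration; the operands and template are those from the question, with only the `&` added. It needs GCC 6 or later on x86 or x86-64 for the `=@ccz` flag output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (the name is not from the question):
   returns a + b and stores 1 in *zero_out if the addition set ZF, else 0. */
static int add_with_zf(int a, int b, int *zero_out)
{
    int sum, zero;

    assert(zero_out != NULL);

    __asm__ volatile (
        "movl %2, %0\n\t"  /* writes %0 while %3 has not been read yet */
        "addl %3, %0"      /* sets ZF when the 32-bit result is 0 */
        : "=&r"(sum),      /* & = earlyclobber: gets a register no input uses */
          "=@ccz"(zero)    /* zero receives ZF as 0 or 1 */
        : "r"(a), "r"(b)
        : "cc"
    );

    *zero_out = zero;
    return sum;
}
```

With this version, a call such as `add_with_zf(7, -7, &z)` should return 0 and set `z` to 1, while `add_with_zf(2, 3, &z)` should return 5 and set `z` to 0.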

A word of warning: GCC inline assembly is powerful and evil. It is very easy to get wrong if you don't know what you are doing. Only use it if you must; avoid it if you can.
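
For this particular task, a version without inline assembly is possible. The sketch below is an assumption about what "avoiding it" could look like, with a hypothetical function name; it computes the zero indicator from the result instead of reading ZF directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plain-C counterpart (name is illustrative):
   returns a + b and stores 1 in *zero_out if the 32-bit result is 0, else 0. */
static int add_and_test_zero(int a, int b, int *zero_out)
{
    unsigned int sum;

    assert(zero_out != NULL);

    /* Unsigned addition wraps around like addl does and avoids
       undefined behavior on signed overflow. */
    sum = (unsigned int)a + (unsigned int)b;
    *zero_out = (sum == 0u);

    /* Converting back to int is implementation-defined for values above
       INT_MAX; GCC reduces modulo 2^32, matching the register contents. */
    return (int)sum;
}
```

This leaves register allocation entirely to the compiler, so the operand-overlap problem cannot arise.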

Michael Petch