
GCC's inline assembler recognizes the declarators =r and =&r. These make sense to me: =r lets the compiler reuse an input register for the output, whereas =&r forbids that reuse.

However, GCC's inline assembler also recognizes the declarators +r and +&r. These make less sense to me. After all, isn't the distinction between +r and +&r a distinction without a difference? Does the +r alone not suffice to tell the compiler to reserve a register for the sole use of a single variable?

For example, what is wrong with the following GCC code?

#include <stdio.h>
int main()
{
    int a = 0;
    printf("Initially, a == %d.\n", a);
    /* The architecture is amd64/x86-64. */
    asm(
        "inc %[a]\n"
        : [a] "+r" (a)
        : : "cc"
    );
    printf("However, after incrementation, a == %d.\n", a);
    return 0;
}

Notice incidentally that my inline assembly lacks an input declaration because, in my (perhaps mistaken) mind, the +r covers input, clobbering, output, everything. What have I misunderstood, please?

BACKGROUND

I have programmed 8- and 16-bit microcontrollers in assembly a bit, but have little or no experience at coding assembly in a hosted environment.

old_timer
thb
  • You are right, it makes no difference. The compiler just happens to accept it anyway. – Jester Aug 26 '17 at 12:59
  • Why are you using inline assembly? – fuz Aug 26 '17 at 13:09
  • @fuz: I am using inline assembly as an exercise, to learn how it works. Reading Intel's x86 [Software Developer's Manual](https://software.intel.com/en-us/articles/intel-sdm), I understand better if I can try things out as I read. – thb Aug 26 '17 at 13:19
  • @thb I see. I recommend that you use Intel's [intrinsic functions](https://software.intel.com/sites/landingpage/IntrinsicsGuide/) instead. Inline assembly should be a last resort as it is notorious for being very fickle. – fuz Aug 26 '17 at 13:20
  • If you just want to try stuff out, why not use a standalone asm file (and a debugger) instead? – Jester Aug 26 '17 at 13:26
  • @fuz: I had not known that inline assembly is fickle. Indeed, I know little or nothing about it, so the advice is appreciated. – thb Aug 26 '17 at 13:29
  • The [documentation](https://gcc.gnu.org/onlinedocs/gcc/Modifiers.html#Modifiers) for the early clobber (`&`) modifier has this statement about its usage on operands that are both read/write: _Furthermore, if the earlyclobber operand is also a read/write operand, then that operand is written only after it’s used._ – Michael Petch Aug 26 '17 at 13:31
  • @thb It's fickle in the sense that the compiler makes arbitrary register allocation choices within the constraints you give it. If your inline assembly contains more than one or two statements, the compiler can easily allocate registers in a way you did not anticipate. That's why I recommend learning assembly first, using either intrinsic functions or assembly functions written in a separate file. – fuz Aug 26 '17 at 13:39
  • @Jester: For whatever reason, while programming C++, I've never taken to debuggers. I'm not sure why. I've tried GDB a couple of times but have always ended up sending debugging messages to `std::cerr` instead. One does not dispute your advice, but to answer the question, before trying inline assembly, I installed NASM and read most of its manual. Using NASM seemed to require learning all about the linker's GOT, PLT, GOTPLT, PLTGOT, GOTOFF, etc., which was interesting but distracted too much from my study of Intel's manual. – thb Aug 26 '17 at 13:40
  • @MichaelPetch: you are perceptive. I had noticed that sentence. The sentence seems tautological to me, but maybe I am reading it wrong. After all, it's my inline assembly, so I'm the one who decides when the operand is written and used, am I not? The compiler knows nothing about such things as far as I am aware. I suppose that I don't see what the `+&r` tells the compiler that the compiler did not already know from a mere `+r`. – thb Aug 26 '17 at 13:59
  • One scenario where it may make a difference is a read/write operand and a read-only operand where the optimizer finds that the value passed in through both operands is the same. The optimizer could conceivably choose the same register for both. If your inline assembly then modified the register before its original value had been consumed, you could get the wrong result. The early clobber on the input/output operand forces the compiler to use two separate registers. – Michael Petch Aug 26 '17 at 14:14
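
The scenario in the last comment can be sketched as follows. This is a minimal illustration, assuming GCC or Clang on x86 or x86-64; the function name `triple` is hypothetical and does not come from the thread. Because `r` starts out as a copy of `x`, the optimizer could place both operands in one register if the read/write operand were declared with plain `+r`. The `&` tells it not to.

```c
/* Hypothetical helper: computes x + x + x with two add instructions. */
int triple(int x)
{
    int r = x;  /* holds the same value as the read-only input below */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__("add %1, %0\n\t"  /* writes %0 while %1 is still needed */
            "add %1, %0"
            : "+&r" (r)       /* early clobber: %0 may not share a register with %1 */
            : "r" (x)
            : "cc");          /* add modifies the flags */
#else
    r += x;  /* plain C fallback so the sketch builds on other targets */
    r += x;
#endif
    return r;
}
```

Calling `triple(5)` should give 15 regardless of optimization level, since the two operands are kept in separate registers.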

1 Answer


GCC assumes by default that an inline assembly statement consists of a single assembly instruction that consumes all of its input operands before writing to any of its output operands. Inline assembly statements that contain multiple instructions often break this assumption, so the early clobber constraint modifier & must be used to mark each output operand that is written before all the input operands have been consumed. This applies both to output operands that use the = modifier and to read/write operands that use +. For example, consider the following two functions:

int
foo() {
    int a = 1;
    asm("add %1, %0" : "+r" (a) : "r" (1));
    return a;
}

int
bar() {
    int a = 1;
    asm("add %1, %0\n\t"
        "add %1, %0"
        : "+r" (a) : "r" (1));
    return a;
}

Both inline assembly statements use the same operands and the same constraints, but only the statement in foo is correct; the one in bar is broken. With optimizations enabled, GCC generates the following code for the two functions:

_foo:
    movl    $1, %eax
/APP
    add %eax, %eax
/NO_APP
    ret

_bar:
    movl    $1, %eax
/APP
    add %eax, %eax
    add %eax, %eax
/NO_APP
    ret

GCC sees no reason not to use the same register, EAX, for both operands in both inline assembly statements. While this isn't a problem in foo, it causes bar to calculate the wrong result of 4 instead of the expected 3.
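
The arithmetic behind those two results can be modeled in plain C. The sketch below is only an illustration, not inline assembly: the `regs` array is a hypothetical stand-in for EAX and EDX, and `two_adds` is an invented name. It replays the two `add` instructions with the operands either sharing a register or not.

```c
/* Toy model: regs[0] and regs[1] stand in for EAX and EDX. */
int two_adds(int dst, int src)
{
    int regs[2] = {0, 0};

    regs[dst] = 1;           /* a = 1, the read/write operand */
    regs[src] = 1;           /* the input operand, also 1 */

    regs[dst] += regs[src];  /* first  add %1, %0 */
    regs[dst] += regs[src];  /* second add %1, %0 */
    return regs[dst];
}
```

Here `two_adds(0, 0)` models bar, where both operands landed in EAX, and yields 4, while `two_adds(0, 1)` models the operands in distinct registers and yields 3.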

A correct version of bar would use the early clobber modifier:

int
baz() {
    int a = 1;
    asm("add %1, %0\n\t"
        "add %1, %0"
        : "+&r" (a) : "r" (1));
    return a;
}

_baz:
    movl    $1, %eax
    movl    %eax, %edx
/APP
    add %edx, %eax
    add %edx, %eax
/NO_APP
    ret

When compiling baz, GCC knows to use a different register for each operand, so it doesn't matter that the read/write output operand is modified before the input operand is read for the second time.

Ross Ridge
  • I'm the upvote. These are reasons why [using inline assembly should be avoided](https://gcc.gnu.org/wiki/DontUseInlineAsm) unless you know what you are doing. Many people don't consider what happens when code is optimized and the shortcuts the compiler may take to streamline the input and output operands passed to extended assembly templates. It doesn't help that the GCC documentation is vague. Thanks, your example shows what my comment to the OP was about. – Michael Petch Aug 26 '17 at 15:20
  • Fascinating. Compiling various permutations of your example using `gcc -S -O2` illustrates the mechanics nicely. However, now I wonder whether `+r` has any legitimate use, whether one should not always prefer `=r` or `+&r`. – thb Aug 26 '17 at 17:11
  • @MichaelPetch: Your point is taken: "... unless you know what you are doing." I suppose that, by heeding your advice, I am trying to learn what I am doing. – thb Aug 26 '17 at 17:13
  • @thb I guess what I should say is that if you are writing production code then you had better know what you are doing. If what you are doing isn't mission critical then no harm done. But in production code you should use intrinsics when available. If you don't want to deal with all the nuances of more complex extended inline assembly then putting it in a separate assembly object (and linking it) would be better. You lose the benefit of optimizations where they would have been possible, but you probably gain in stability. – Michael Petch Aug 26 '17 at 18:29
  • @thb As for the `+r` usage: if you know that no other input operand may be passed through the same register (for example, because it carries the same value) and that you won't clobber the operand before reading it, then you can get better optimized code, since the same register can be reused for multiple operands. If you are new to using `+r`, the safe bet is `+&r`, but the compiler may not be able to perform some optimizations when the `&` early clobber modifier is present. – Michael Petch Aug 26 '17 at 18:39
  • @thb If you want to see a recent discussion of a peculiar problem someone ran into, you may wish to check this [SO Question and Answer](https://stackoverflow.com/q/45649101/3857942). It is related to passing an array (of no fixed length) as a memory operand. We determined what the compiler was doing, but it caught some of us off guard. It ended up with a post on the GCC mailing list to determine the expected behaviour and the best way to handle it. It appears that, as a result, the future GCC inline assembly docs will be updated to be clearer and to explain how to work around it. – Michael Petch Aug 26 '17 at 18:52
  • @thb: My reason for mentioning the other issue is to show that the expected behavior and/or the documentation are sometimes vague enough to cause confusion even for those who have been using inline assembly templates for some time. I use inline assembly very sparingly. In a code review inline assembly gets a lot of extra scrutiny from me because it can be the root of some very evil bugs that may show up when you least expect it. – Michael Petch Aug 26 '17 at 18:53
  • @thb I've been using GCC and its form of inline assembly for almost 30 years now, and I'm also fairly familiar with the GCC internals that it's based on (though out of date), and I still find it a minefield. One of the mistakes I've made, not that long ago, was to incorrectly assume there was no need to use `&` with `+`. So I avoid using inline assembly, and it's been a long time since I've used it anywhere other than a Stack Overflow answer. – Ross Ridge Aug 26 '17 at 19:14
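
As a sketch of the case described in the comments where plain `+r` is enough, consider a statement with a single instruction and no separate input operands, like the `inc` in the question. The wrapper name `increment` is hypothetical, and the non-x86 branch is only a fallback so that the example compiles elsewhere.

```c
/* Hypothetical wrapper around the question's single-instruction asm. */
int increment(int a)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__("inc %[a]"
            : [a] "+r" (a)  /* read and written by the one instruction */
            :               /* no other inputs that could share its register */
            : "cc");        /* inc modifies the flags */
#else
    a += 1;                 /* plain C fallback for other targets */
#endif
    return a;
}
```

With only one operand and one instruction, there is no second operand that could be read after the write, so the early clobber modifier has nothing to protect against here.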