Very simple inline assembly program gives segmentation fault:11 on gcc-5.0

Question

#include <stdio.h>

int square (int n) {
  __asm__("mov %eax, n"
      "mul %eax");
}

int main(void) {
  printf("\nSquare of 4 is %i", square(4));
  /* Calling square gives Segmentation fault: 11 error */
  return 0;
}

When I compile this code on an iMac (Core 2 Duo) running Mac OS X 10.7 with gcc-5.0.0, using gcc -o assem -DDEBUG=9 -ansi -pedantic -Wall -g assem.c, it compiles with a warning:

assem.c: In function ‘square’:
assem.c:6:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^
assem.c:4:Can't relocate expression. Absolute 0 assumed.

Compilation finished at Mon Jul 25 18:23:47

When I run it, it gives Segmentation fault: 11.

How can I fix it?

Note: I've browsed about 10 questions about Segmentation fault: 11, assembly and inline assembly; none of them helped.

Update

When I change the inline-assembly to: asm ("imul %0, %0" : "+r"(n)); return n; The compiler gives this error:

assem.c: In function ‘square’:
assem.c:4:1: warning: implicit declaration of function ‘asm’ [-Wimplicit-function-declaration]
 asm ("imul %0, %0" : "+r"(n)); 
 ^
assem.c:4:20: error: expected ‘)’ before ‘:’ token
 asm ("imul %0, %0" : "+r"(n)); 
                    ^
assem.c:7:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^

Compilation exited abnormally with code 1 at Mon Jul 25 18:46:49

When I change the assembly to just asm ("imul %0, %0" : "+r"(n)); the compiler gives a similar error to the one above.

Update 2 (25.Jul.2022)

In an attempt to solve the issue without radically changing the square function, I've copied part of the code from Peter Cordes's comment with a clang version of it:

#include <stdio.h>

int quadrat (int n) {
  asm { mov eax, n / imul eax,eax / mov n, eax / };
}

int main(void) {
  printf("\nSquare of 4 is %i\n", 4*4);
  return 0;
}

I have clang-3.7.1 on my Mac OS X:

>clang -v
clang version 3.7.1 (tags/RELEASE_371/final)
Target: x86_64-apple-darwin11.4.2
Thread model: posix

I've tried to compile it using:

clang -fasm-blocks ass-clang.c

Note: I normally don't ever use clang

The code didn't compile:


ass-clang.c:4:20: error: unexpected token in argument list
  asm { mov eax, n / imul eax,eax / mov n, eax / };

Update #3 (Specific to the bountied question)

How to fix this code

int square (int n) {
  __asm__("mov %eax, n"
      "mul %eax");
}

without altering its basic structure? That is, the n will be moved to eax (or to any other register, if that's necessary) then that register's value will be multiplied by itself, preferably using the mul command and finally the result will be returned preferably without using the return command. In other words, I need a fix to the code, not a rewrite. For instance I consider this to be rewrite:

asm ("imul %0, %0" : "+r"(n)); return n;

Besides, this rewrite is not intuitive. What's that : ? What's that "+r" doing there, is it assigning the Unix read permissions :)

This is not how gcc-style inline assembly works. Read [the manual](https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html) before attempting to use it. — fuz, Jul 25 '22 at 15:29
In your specific case, you'll need something like `asm ("imul %0, %0" : "+r"(n)); return n;` — fuz, Jul 25 '22 at 15:30
@ScottHunter no, that's not the problem here. Look at [this](https://www.godbolt.org/z/EEnvafKcf) — Jabberwocky, Jul 25 '22 at 15:31
@Jabberwocky: Even if it isn't the cause of the segfault, it is definitely a problem. — Scott Hunter, Jul 25 '22 at 15:33
@fuz I've browsed the manual a bit but it didn't help much. For instance, in the manual it seems to say that the C function argument variables be inside parenthesis, like `(n)` when I do it it gives a compiler error. — Lars Malmsteen, Jul 25 '22 at 15:38
@ScottHunter no it isn't. In x86 gcc the return value of functions is in `eax`, therefore there is no need for an explicit `return` statement. — Jabberwocky, Jul 25 '22 at 15:40
@ScottHunter How to specify the ret value from the inline-assembly? Btw, I've fixed a very minor typo: The `square(33)` should have been `square(4)`. — Lars Malmsteen, Jul 25 '22 at 15:42
@LarsMalmsteen the 3rd comment is the answer you're looking for — Jabberwocky, Jul 25 '22 at 15:44
@Jabberwocky That's exactly what's written in the "C from scratch" book which I got this code sample from. There the author says the content of `eax` is the return value. — Lars Malmsteen, Jul 25 '22 at 15:45
@LarsMalmsteen You want `asm ("imul %0, %0" : "+r"(n)); return n;`. Or even just `asm ("imul %0, %0" : "+r"(n));` because the return value is already in eax. — Jabberwocky, Jul 25 '22 at 15:47
@Jabberwocky I've tried the suggested assembly codes. They didn't even get compiled. See the update. — Lars Malmsteen, Jul 25 '22 at 15:56
@Jabberwocky Omitting the return statement leads to broken code. What if the function is inlined? What if the compiler overwrites eax afterwards? You must not assume that registers keep any defined value between asm statements. — fuz, Jul 25 '22 at 16:21
@LarsMalmsteen If what you have there is what the book you are reading says, then the book is either wrong or it is meant for MSVC-style inline assembly which works very differently from gcc-style inline assembly (and has different syntax, too, which you presumably edited until it compiled without an understanding as to what you are doing). — fuz, Jul 25 '22 at 16:29
@LarsMalmsteen As for “variables need to be in parentheses,” no, they don't. It seems like you have merely skimmed the manual without actually reading it. The key thing is that you need to declare the operands to an inline assembly statement. You cannot refer to C variables in the statement itself, only to operands you declared. Read the whole thing again, this time more carefully. — fuz, Jul 25 '22 at 16:31
What does have parentheses is the operand declaration syntax `"..."(...)`, but the thing in parentheses can be any expression, not just a variable. And once you have declared an operand, you do not have to put it into parentheses when using it in the assembly; indeed, parentheses in assembly have a different use; they are used to denote index registers. — fuz, Jul 25 '22 at 16:37
`-ansi` is like `-std=c89` or something, not `gnu89`, so the namespace is kept clean. Only `__asm__` versions of names are defined, not `asm`. `-ansi` is a very old version of the C standard; especially in code that uses GNU extensions like inline assembly, you should use the default which is `-std=gnu11` in recent GCC. — Peter Cordes, Jul 25 '22 at 17:52
@PeterCordes When I changed the compiler option `std` from `ansi` to `gnu11` the modified `square` function (which introduced the `imul` command) ran without an error. However I'd like to make the `square` function run in its original, unmodified form. Is that possible? — Lars Malmsteen, Jul 25 '22 at 21:01
What, the form shown in the question `__asm__("mov %eax, n" "mul %eax");`, with [GNU C Basic asm](https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended) (no constraints) in a function that isn't `__attribute__((naked))` (where you'd need a `ret`, and to handle the calling convention manually)? No, zero chance for that to be safe. Also no way for that to access non-global variables at all, and no way to *safely* access local vars. Also, AT&T syntax has the destination on the right, so unless you want to store garbage from EAX into `n`, also no. — Peter Cordes, Jul 25 '22 at 21:07
Clang supports a `-fasm-blocks` option which would let you write `asm { mov eax, n` / `imul eax,eax` / `mov n, eax` / `}`, then you could `return n;`. But if you just want to leave a value in EAX and return, you need to write the whole function in asm, e.g. with `__attribute__((naked))`, including a `ret` instruction. Or in a separate `.s` file, or in a global scope `asm("square: mov %edi, %eax; imul %eax,%eax; ret");` statement with a separate prototype to tell the compiler about it. — Peter Cordes, Jul 25 '22 at 21:09
@PeterCordes I've tried the `clang`-style assembly code you've given, tried to compile it with `clang-3.7.1`, which is the only clang compiler I know of on my Mac, and it didn't compile. See Update 2. — Lars Malmsteen, Jul 25 '22 at 21:35
The / characters in my comment were supposed to be line separators. That's why they're not inside the code formatting. See [Is there any way to complie a microsoft style inline-assembly code on a linux platform?](https://stackoverflow.com/q/57186687) for an actual example of what it looks like. — Peter Cordes, Jul 25 '22 at 21:38

score 4 · Answer 1 · edited Jul 28 '22 at 04:13

> preferably without using the return command.

If you want your C code to return a value, it's going to require using return. That's just how C works. That's why Peter suggested using clang -fasm-blocks, which allows MSVC's style of inline asm. The following compiles with clang on Godbolt, but to inefficient asm that stores n to the stack so it can be a memory operand in the asm block (this style of inline asm requires that inefficiency).

int square(int n)
{
    asm { 
    mov eax, n 
    imul eax,eax 
    mov n, eax }
    
    return n;
}

If you were writing this code in pure asm, the 'eax' register is used to hold the return value from a function. So you can 'cheat' your way into not using the C return statement, but only if you tell the compiler that you'll be handling the act of returning from the function yourself. That's what Peter was talking about when he suggested using __attribute__((naked)).

This attribute informs the compiler that it should assume that everything is being handled 100% by assembler code in the function (including the return value and actually returning). That's what he was talking about when he said you could omit the C return statement and just use an x86 ret instruction once you've populated eax. If you mark a function as naked, you cannot use a return statement. The function must only contain asm, not C code (see the docs). The compiler keeps its hands off entirely, so you can be sure that RSP is pointing at a return address when your asm statement or block gets control. But it also means you can't use named local variables or arguments in your asm, so you have to implement the calling convention yourself.

In essence naked means you are writing a function in assembly, but using the C compiler to give it a name and prototype, and to make it part of your C source file. Not an approach I'd recommend. If you want to write an assembly function, write an assembly function in assembly and link it to your C code. Trying to cram the two together usually just results in confusion.

But naked would allow you to omit the return statement and use a ret instruction, if for some reason that's essential.
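
For illustration only, a naked version might look like the sketch below. It assumes an x86-64 target using the System V calling convention (first integer argument in EDI, return value in EAX) and a compiler that accepts __attribute__((naked)) on x86 (clang, or GCC 8 and later); on anything else it falls back to plain C. The names square_naked and check_square_naked are made up for this example.

```c
#include <assert.h>

#if defined(__x86_64__) && !defined(_WIN32) && \
    (defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 8))
/* Assumption: x86-64 System V ABI, so n arrives in EDI and the
   result is handed back in EAX. The compiler emits no prologue,
   no epilogue and no ret; the asm does everything. */
__attribute__((naked)) int square_naked(int n)
{
    __asm__("mov %edi, %eax\n\t"   /* AT&T syntax: copy EDI into EAX */
            "imul %eax, %eax\n\t"  /* EAX = EAX * EAX */
            "ret");                /* return to the caller ourselves */
}
#else
/* Portable fallback so the example still builds on other targets. */
int square_naked(int n) { return n * n; }
#endif

/* Small self-check, hypothetical helper for this example. */
void check_square_naked(void)
{
    assert(square_naked(4) == 16);
    assert(square_naked(-3) == 9);
}
```

Note that the body never mentions n: a naked function has to fetch its argument from wherever the calling convention put it.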

> What's that : ?

If you've read the docs (you have read the docs, right?), you would have seen:

The first colon delimits the 'output' constraints.
The second colon delimits the 'input' constraints.
The third colon delimits the 'clobbers'.

The fact that @fuz only uses one colon means that he has no input-only constraints and uses no clobbers.
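
To make the three sections concrete, here is a minimal sketch that uses all of them. The function add_asm and its self-check are hypothetical names, not from the question; it assumes GCC or clang targeting x86 and falls back to plain C elsewhere. It spells the keyword __asm__ so that it also compiles under -ansi.

```c
#include <assert.h>

int add_asm(int a, int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int sum = a;
    __asm__("add %1, %0"   /* AT&T syntax: %0 = %0 + %1 */
            : "+r"(sum)    /* after 1st colon: outputs (read and written) */
            : "r"(b)       /* after 2nd colon: input-only operands */
            : "cc");       /* after 3rd colon: clobbers (add changes flags) */
    return sum;
#else
    /* Portable fallback for non-x86 targets. */
    return a + b;
#endif
}

/* Small self-check, hypothetical helper for this example. */
void check_add_asm(void)
{
    assert(add_asm(2, 3) == 5);
    assert(add_asm(-7, 7) == 0);
}
```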

> What's that "+r" doing there

Looking at the docs, "r" means "move the value into a register before invoking the asm instructions." And the + means that the value is being updated (as opposed to = which would mean written-but-not-read).

> the n will be moved to eax (or to any other register, if that's necessary)

By writing "+r"(n), we've already moved the value from n into a register, and no longer need to include a mov instruction in the template. Since n may already be in a register from earlier code, this probably saves doing an extra mov instruction.

Which register will it pick? Since we didn't specify, the compiler will pick whichever one is most efficient. Since we don't know which one that will be (and it may change from compile to compile), we use %0 to refer to the first constraint, %1 for the second (if we had one), etc.
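
As a rough sketch of the difference between = and +, and of how %0 and %1 number the operands, the variant below keeps the explicit mov by using a write-only output plus a separate input. The name square_copy is invented for this example; it assumes GCC or clang on x86, with a plain C fallback for other targets.

```c
#include <assert.h>

int square_copy(int n)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int result;
    __asm__("mov %1, %0\n\t"    /* %0 (result) = %1 (n) */
            "imul %0, %0"       /* %0 = %0 * %0 */
            : "=r"(result)      /* operand %0: written, never read */
            : "r"(n)            /* operand %1: read only */
            : "cc");            /* imul changes the flags */
    return result;
#else
    /* Portable fallback for non-x86 targets. */
    return n * n;
#endif
}

/* Small self-check, hypothetical helper for this example. */
void check_square_copy(void)
{
    assert(square_copy(4) == 16);
    assert(square_copy(-5) == 25);
}
```

The compiler is free to pick any general-purpose register for each operand, which is why the template only ever says %0 and %1.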

> preferably using the mul command

Well, there are a couple of problems with that. The one-operand mul instruction implicitly uses the eax register as one of its inputs. Forcing the compiler to explicitly use eax might generate less efficient code if it's using it for something else.

But more importantly, mul writes 2 registers. When you multiply two 32-bit registers, the largest possible result is twice as wide. mul is a widening multiply that puts the 64-bit output in edx:eax.

Your original code makes no provision for values that big. But you can't just ignore the fact that the edx register is getting changed. If you don't tell the compiler that you're changing the contents of that register, it's not going to know. How can it not know? After all, the 'mul' instruction is right there, right? From the docs:

> GCC does not parse the assembler instructions themselves and does not know what they mean or even whether they are valid assembler input.

Which means if you alter a register without letting the compiler know (via output constraints or clobbers), you can get a big mess if the compiler was using it for something else. By contrast, imul with more than one operand only outputs a single register, which is more consistent with your original code. (Non-widening mul isn't needed because the low half of a full multiply is the same whether the inputs are treated as signed or unsigned.)
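
If you really do want mul, a hedged sketch of the closest thing to the original code could look like this. The "+a" constraint pins n to eax, and the clobber list tells the compiler that edx (and the flags) get overwritten. The name square_mul is made up for this example; it assumes GCC or clang on x86 and falls back to plain C elsewhere.

```c
#include <assert.h>

int square_mul(int n)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__("mul %0"          /* edx:eax = eax * eax (widening) */
            : "+a"(n)         /* "a" forces n into eax, read and written */
            :                 /* no input-only operands */
            : "edx", "cc");   /* high half lands in edx; flags change */
    return n;                 /* low 32 bits of the product */
#else
    /* Portable fallback for non-x86 targets. */
    return n * n;
#endif
}

/* Small self-check, hypothetical helper for this example. */
void check_square_mul(void)
{
    assert(square_mul(4) == 16);
    assert(square_mul(-6) == 36);
}
```

Only the low half in eax is kept, which is why the result matches the imul version even for negative inputs.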

> For instance I consider this to be rewrite:

I'm not sure I agree.

You want to move the value into a register; "+r" does that for you. You can use mul instead of imul if you insist, but you'll be forcing the compiler to free up the edx register for you in addition to eax, and to run a slower instruction (more uops, to write a second register). Why make your code less efficient?

The choice is between "clobbering" the edx register or using imul. Which is the smaller re-write?

> Besides, this rewrite is not intuitive.

Ok, now there you've got me.
Writing inline asm is hard (which is why I recommend that you don't do it).

But this "extended asm" approach is how gcc, clang, Intel, etc. all do it. Microsoft (not surprisingly) went a different way. But their solution turned out to be so hard (or was implemented so unmaintainably in their compiler) that they decided not to support inline asm for 64-bit code. At all. So if you insist on writing inline asm (the most complex way to use assembly language with C), you'll need to learn how it works.

Fuz's approach generates the most efficient code. It also allows the code to be inlined, which most of the other approaches described here do not. It uses the minimum number of registers, leaving these precious resources available for the compiler to use for other purposes.

In summary, I don't know how you're going to find a cleaner, more efficient inline asm solution than this:

int square (int n) {
    asm ("imul %0, %0" : "+r"(n)); 
    return n;
}

Compare how it compiles to how the MSVC-style inline asm version compiles, on Godbolt with clang 14:

# this version, asm ("imul %0, %0" : "+r"(n));
square_gnu(int):
        mov     eax, edi
        imul    eax, eax
        ret

# asm { ... }  version that needs clang -fasm-blocks
# clang -O3 -fasm-blocks
square_msvc(int):
        mov     dword ptr [rsp - 4], edi     # compiler-generated store, into the red-zone

        mov     eax, dword ptr [rsp - 4]     # mov eax, n
        imul    eax, eax                     # square n
        mov     dword ptr [rsp - 4], eax     # mov n, eax

        mov     eax, dword ptr [rsp - 4]     # compiler-generated reload
        ret

BTW, MSVC documents support for leaving a value in EAX at the end of an asm{} block and then falling off the end of the function without a C return statement. This works in MSVC even when inlining a function containing an asm{} block. (Presumably enough programmers relied on behavior that "happens to work" that MS made it official.) This reduces the inefficiency of getting a result out of an asm{} block, but doesn't help with the store/reload to get a value in.

But in clang -fasm-blocks, it only happens to work by chance, breaking when inlined. See Does __asm{}; return the value of eax?
