
I noticed some really weird behavior while playing with libc's system() function on x86-64 Linux: sometimes the call to system() fails with a segmentation fault. Here's what I found after debugging it with gdb.

The segmentation fault is caused by this instruction:

=> 0x7ffff7a332f6 <do_system+1094>: movaps XMMWORD PTR [rsp+0x40],xmm0

According to the manual, this is the cause of the SIGSEGV:

When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) is generated.

Looking deeper, I noticed that my rsp value was indeed not 16-byte aligned (that is, its hex representation didn't end with 0). Manually modifying rsp right before the call to system() makes everything work.
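The check described above (does the hex value end in 0?) and the manual fix can be sketched in C on plain integers. This is only an illustration: the helper names `is_aligned_16` and `align_down_16` are made up for this example, and the rounding is assumed to mirror what an instruction such as `and rsp, -16` would do to the real register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of libc. */

/* True if addr is a multiple of 16, i.e. its low four bits are zero. */
static bool is_aligned_16(uint64_t addr) {
    return (addr & 0xF) == 0;
}

/* Round addr down to the nearest 16-byte boundary by clearing the low four bits. */
static uint64_t align_down_16(uint64_t addr) {
    return addr & ~(uint64_t)0xF;
}
```

For the value that appears in the ltrace output, `is_aligned_16(0x7ffe3eabe6c8)` is false, and `align_down_16(0x7ffe3eabe6c8)` gives `0x7ffe3eabe6c0`.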

So I've written the following program:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    register long long int sp asm ("rsp");
    printf("%llx\n", sp);

    if (sp & 0x8) /* == 0x8*/
    { 
        printf("running system...\n");
        system("touch hi");
    } 

    return 0;
}

I compiled it with gcc 7.3.0, and sure enough, this is the output:

sha@sha-desktop:~/Desktop/tda$ ltrace -f ./o_sample2
[pid 26770] printf("%llx\n", 0x7ffe3eabe6c87ffe3eabe6c8
)                                           = 13
[pid 26770] puts("running system..."running system...
)                                                  = 18
[pid 26770] system("touch hi" <no return ...>
[pid 26771] --- SIGSEGV (Segmentation fault) ---
[pid 26771] +++ killed by SIGSEGV +++
[pid 26770] --- SIGCHLD (Child exited) ---
[pid 26770] <... system resumed> )           = 139
[pid 26770] +++ exited (status 0) +++
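As a side note on reading that trace: assuming the 139 returned by system() is the raw wait status of the child (as it would be on Linux with glibc), it can be decoded with the standard macros from `<sys/wait.h>`. The helper name `terminating_signal` is hypothetical and only for this sketch.

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical helper: returns the number of the signal that killed the
   child, or 0 if the wait status does not describe death by a signal. */
static int terminating_signal(int status) {
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux, 139 is 0x80 | 11: the low seven bits hold signal 11 (SIGSEGV) and bit 0x80 is the core-dump flag, so `terminating_signal(139)` matches the SIGSEGV reported for pid 26771.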

So with this program, I cannot execute system() at all.

One more small thing, and I can't tell whether it's relevant to the problem: almost all of my runs end up with a bad rsp value and a child that is killed by SIGSEGV.

This makes me wonder a few things:

  1. Why does system() use the XMM registers at all?
  2. Is this normal behavior, or am I missing something elementary about how to use the system() function properly?

Thanks in advance

shaqed
  • 1
    The 16-byte (or 32-byte) alignment is a requirement of the x86-64 System V ABI. The _C_ library often uses XMM registers and related instructions for performance reasons, so this is normal behavior. Using `register long long int sp asm ("rsp");` and accessing it as a variable without extended inline assembly is not defined behavior; it only works by luck. I'd like to see the original code that apparently fails. Can you show us the original code where you call system and it fails? This smells like an XY problem – Michael Petch Jan 27 '19 at 21:59
  • @MichaelPetch Actually the "original code" is when I was trying some return-oriented-programming as I was trying to override the return address of some function to be `system`'s – shaqed Jan 27 '19 at 22:23
  • 3
    Then it is that code which is the problem. The compiler maintains the alignment by itself. Your ROP code is what is at fault. It needs to maintain that alignment before calling `system`, per the x86-64 ABI. As you suggested, rounding RSP down to the nearest 16-byte boundary works. I assume you do it with something like `and rsp, -16`? The ABI states that at the point of a function call the stack pointer needs to be 16-byte (in some cases 32-byte) aligned. And there is nothing wrong with functions like `system` using aligned vector instructions to improve performance (that is normal). – Michael Petch Jan 27 '19 at 22:28
  • @MichaelPetch So the reason my code's `rsp` isn't aligned is because of the `asm` ? I thought it's a known directive to `gcc` and besides the fact that it compiles my program, shouldn't it also not "mess up" my `rsp` ? – shaqed Jan 27 '19 at 22:43
  • 1
    Programs can compile, but that doesn't mean they'll run properly. Yes, your assembly is the cause of the problems. GCC maintains alignment in its own code to ensure the alignment requirements are met. It has no knowledge of any ROP code running. It is up to the ROP code to ensure alignment is maintained before calling `system`. If whatever you do in the ROP code misaligns the stack, it is your job to align it. – Michael Petch Jan 27 '19 at 22:52
  • @MichaelPetch: I don't think `int foo asm("rsp")` is supposed to be able to cause your code to miscompile, especially if you never assign to that variable. If gcc allows itself to generate wrong code from this without a warning, that's a bug IMO. The manual says the only supported use is for making `"r"(var)` constraints pick the register you want (https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html), but it doesn't imply that simply using such a variable other times can be problematic. – Peter Cordes Jan 28 '19 at 08:47
  • Oh, maybe I'm wrong: I can't reproduce the OP's results on GCC8.2 -O0 or -O3, or clang7.0 -O0 on Arch Linux. But I *can* get `clang -O3` to print a value that's only aligned by 8, not 16. But that's because it calls `printf` without modifying `rsi` after entry to `main` (so it prints `argv`), and it optimized away the block that calls `system()`. https://godbolt.org/z/1KJjpa. – Peter Cordes Jan 28 '19 at 08:48
  • @PeterCordes :I upvoted your answer, but if you read back in the comments the OP gave this important tidbit _Actually the "original code" is when I was trying some return-oriented-programming as I was trying to override the return address of some function to be system's_ . What is going on is not in compiler generated code, but in an exploit (ROP) being introduced into the code. – Michael Petch Jan 29 '19 at 03:46
  • @MichaelPetch: Then the question itself needs to say so. As it stands, it reads like it's claiming the OP compiled that program with gcc7.3 and ran it under ltrace to get that output. I did see some mention of ROP in comments while writing my answer, so I explained things in a way that usefully answers what the OP should have asked. But that doesn't make it ok to ask questions with a misleading [mcve] that isn't actually what you did to get the output you posted. If there had been a real MCVE, you and I wouldn't have wasted time wondering if `asm("rsp")` could make gcc emit bad code. – Peter Cordes Jan 29 '19 at 03:59

1 Answer


The x86-64 System V ABI guarantees 16-byte stack alignment before a call, so libc's system() is allowed to take advantage of that for 16-byte aligned loads/stores. If you break the ABI, it's your problem if things crash.

On entry to a function, after a call has pushed the 8-byte return address, RSP is 8 bytes away from a 16-byte boundary (RSP+8 and RSP-8 are both 16-byte aligned), so one more push will set you up to call another function.

GCC of course normally has no problem doing this, by using either an odd number of pushes or using a sub rsp, 16*n + 8 to reserve stack space. Using a register-asm local variable with asm("rsp") doesn't break this, as long as you only read the variable, not assign to it.
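The arithmetic behind those two paragraphs can be sketched by modeling RSP as a plain integer. This is an illustrative model only, not compiler output; all function names here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: RSP is just a number, and each helper applies the
   adjustment the corresponding instruction would make. */

static bool rsp_aligned16(uint64_t rsp) { return rsp % 16 == 0; }

/* `call` pushes an 8-byte return address. */
static uint64_t after_call(uint64_t rsp) { return rsp - 8; }

/* One `push` of a 64-bit register. */
static uint64_t after_push(uint64_t rsp) { return rsp - 8; }

/* `sub rsp, 16*n + 8`: n 16-byte slots plus 8 bytes to restore alignment. */
static uint64_t after_sub(uint64_t rsp, uint64_t n) { return rsp - (16 * n + 8); }
```

Starting from an aligned value such as 0x7ffe3eabe6d0 in the caller, `after_call` leaves RSP % 16 == 8 on entry to the callee. A single `after_push`, or `after_sub` with any n, makes it 16-byte aligned again, while two pushes leave it misaligned.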

You say you're using GCC7.3. I put your code on the Godbolt compiler explorer and compiled it with -O3, -O2, -O1, and -O0. It follows the ABI at all optimization levels, making a main that starts with sub rsp, 8 and doesn't modify RSP inside the function (except for call), until the end of the function.

So does every other version and optimization level of clang and gcc I checked.

This is gcc7.3 -O3's code-gen: note that it does not do anything to RSP except read it inside the function body, so if main is called with a valid RSP (16-byte aligned - 8), all of main's function calls will also be made with 16-byte aligned RSP. (And it will never find sp & 8 true, so it will never call system in the first place.)

# gcc7.3 -O3
main:
        sub     rsp, 8
        xor     eax, eax
        mov     edi, OFFSET FLAT:.LC0
        mov     rsi, rsp          # read RSP.
        call    printf
        test    spl, 8            # low 8 bits of RSP
        je      .L2
        mov     edi, OFFSET FLAT:.LC1
        call    puts
        mov     edi, OFFSET FLAT:.LC2
        call    system
.L2:
        xor     eax, eax
        add     rsp, 8
        ret

If you're calling main in some non-standard way, you're violating the ABI. And you don't explain it in the question, so this is not a MCVE.

As I explained in Does the C++ standard allow for an uninitialized bool to crash a program?, compilers are allowed to emit code that takes advantage of any guarantees the ABI of the target platform makes. This includes using movaps for 16-byte loads/stores to copy stuff around on the stack, taking advantage of the incoming alignment guarantee.


It's a missed optimization that gcc doesn't optimize away the if() entirely, like clang does.

But clang is really treating it as an uninitialized variable: because it is never used in an asm statement, the register-local asm("rsp") has no effect for clang, I think. Clang leaves RSI unmodified before the first printf call, so clang's main actually prints argv, never reading RSP at all.

Clang is allowed to do this: the only supported use for register-asm local vars is making "r"(var) extended-asm constraints pick the register you want. (https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html).
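If the goal is only to observe the stack pointer, one alternative that avoids relying on a bare register-asm variable is an extended-asm statement with an output operand. This is a different technique from the one in the question, shown here as a minimal sketch: it assumes GCC or clang with the default AT&T assembler syntax, and `read_stack_pointer` is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical helper: returns the stack pointer at the point of the asm
   statement. Assumes GCC or clang (GNU extended asm). */
static inline uint64_t read_stack_pointer(void) {
#if defined(__x86_64__)
    uint64_t sp;
    /* "=r" lets the compiler choose the destination register for the copy. */
    __asm__ volatile ("mov %%rsp, %0" : "=r"(sp));
    return sp;
#else
    /* Assumed fallback for other targets: the frame address, which is
       close to, but not exactly, the stack pointer. */
    return (uint64_t)(uintptr_t)__builtin_frame_address(0);
#endif
}
```

The value is a snapshot taken wherever the compiler places the asm statement, so it says nothing by itself about the alignment at a later call site.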

The manual doesn't imply that simply using such a variable other times can be problematic, so I think this code should be safe in general according to the written rules, as well as happening to work in practice.

The manual does say that using a call-clobbered register (like "rcx" on x86) would lead to the variable being clobbered by function calls, so perhaps a variable using rsp would be affected by compiler-generated push/pop?

This is an interesting test-case: see it on the Godbolt link.

// gcc won't compile this: "error: unable to find a register to spill"
// clang simply copies the value back out of RDX before idiv
int sink;
int divide(int a, int b) {
    register long long int dx asm ("rdx") = b;
    asm("" : "+r"(dx));  // actually make the compiler put the value in RDX

    sink = a/b;   // IDIV uses EDX as an input

    return dx;
}

Without the asm("" : "+r"(dx));, gcc compiles it just fine, never putting b into RDX at all.

Peter Cordes