Why does the following assembly sequence signal SIGILL?

Question

These are all valid instructions up until 0x7fffffffdbe4, at which point the program has already called the exit syscall.

(gdb) x/20i $rip
    => 0x7fffffffdbb0:  movabs rax,0x2168732f6e69622f
       0x7fffffffdbba:  push   rax
       0x7fffffffdbbb:  lea    rdi,[rsp]
       0x7fffffffdbbf:  xor    rax,rax
       0x7fffffffdbc2:  mov    BYTE PTR [rdi+0x7],al
       0x7fffffffdbc5:  mov    QWORD PTR [rdi+0x8],rdi
       0x7fffffffdbc9:  mov    BYTE PTR [rdi+0x10],al
       0x7fffffffdbcc:  mov    rsi,QWORD PTR [rdi+0x8]
       0x7fffffffdbd0:  push   rax
       0x7fffffffdbd1:  push   rdi
       0x7fffffffdbd2:  mov    rsi,rsp
       0x7fffffffdbd5:  add    rax,0x3b
       0x7fffffffdbd9:  syscall 
       0x7fffffffdbdb:  add    rax,0x1
       0x7fffffffdbdf:  xor    rdi,rdi
       0x7fffffffdbe2:  syscall 
       0x7fffffffdbe4:  and    DWORD PTR [rcx],esp
       0x7fffffffdbe6:  and    DWORD PTR [rcx],esp
       0x7fffffffdbe8:  mov    al,0xdb
       0x7fffffffdbea:  (bad)

Unexpected behavior is seen after the instruction at 0x7fffffffdbb1 is executed, and I cannot understand why.

(gdb) nexti
0x00007fffffffdbba in ?? ()
(gdb) nexti
Warning:
Cannot insert breakpoint 0.
Cannot access memory at address 0x2168732f6e69622f

0x00007fffffffdbbb in ?? ()
(gdb) i r rsp
rsp            0x7fffffffdbe8   0x7fffffffdbe8
(gdb) i r rip
rip            0x7fffffffdbbb   0x7fffffffdbbb
(gdb) nexti
0x00007fffffffdbbf in ?? ()
(gdb) nexti
0x00007fffffffdbc2 in ?? ()
(gdb) nexti
0x00007fffffffdbc5 in ?? ()
(gdb) nexti
0x00007fffffffdbc9 in ?? ()
(gdb) nexti
0x00007fffffffdbcc in ?? ()
(gdb) nexti
0x00007fffffffdbd0 in ?? ()
(gdb) nexti
Warning:
Cannot insert breakpoint 0.
Cannot access memory at address 0x0

0x00007fffffffdbd1 in ?? ()
(gdb) nexti

Program received signal SIGILL, Illegal instruction.
0x00007fffffffdbd9 in ?? ()
(gdb)

I am posting output starting at 0x7fffffffdbba because gdb cannot seem to set a breakpoint at address 0x2168732f6e69622f (the value pushed onto the stack) and then at address 0.
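
For reference, the address in the first warning is the `movabs` immediate itself rather than a real pointer. A minimal sketch in Python (the variable names are only illustrative) decodes it as little-endian bytes and shows what `mov BYTE PTR [rdi+0x7],al` with `al = 0` does to it:

```python
# The 64-bit immediate loaded by movabs and pushed onto the stack.
value = 0x2168732F6E69622F

# x86-64 stores integers little-endian, so the lowest byte comes first in memory.
raw = value.to_bytes(8, "little")
print(raw)

# mov BYTE PTR [rdi+0x7],al with al == 0 replaces the eighth byte with NUL,
# turning the 8-byte blob into a C string terminator-ended path.
patched = raw[:7] + b"\x00"
print(patched)
```

So the pushed quadword is the path string with a placeholder byte that the shellcode later turns into the terminating NUL.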

SIGILL is not mapped to hardware's #UD (illegal instruction) all the time; there may be other situations when the OS decides to signal SIGILL. — Grigory Rechistov, Mar 29 '18 at 05:42
Also, here's list of hardware exceptions that `push rdi` may generate in the 64-bit mode. #GP(0) - If the memory address is in a non-canonical form; #SS(0) - If the stack address is in a non-canonical form; #PF(fault-code) If a page fault occurs; #AC(0) - If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3; #UD - If the LOCK prefix is used. #UD - If the PUSH is of CS, SS, DS, or ES. Cannot immediately guess which one is the case in your situation however. — Grigory Rechistov, Mar 29 '18 at 05:46
Examine memory at rsp and rip once it gets to dbd1. The stack is encroaching on the code, and has already overwritten one syscall instruction and is about to overwrite the other, so the code clearly is going to fail, but I don’t quite see why it fails at dbd1 instead of at dbd9. — prl, Mar 29 '18 at 06:43
Ah, perhaps the single-step operation puts stuff on the stack, overwriting 16 additional bytes of code. — prl, Mar 29 '18 at 06:45
Out of curiosity. Did you compile the program you are exploiting with this exploit using optimizations turned on (`-O1`, `-O2` or `-O3`)? Wondering if the Linux red zone is accounting for the fact that your stack pointer is starting out higher than the placement of the code on the stack. You might want to add -128 to RSP before the first push. — Michael Petch, Mar 29 '18 at 11:28
You can use `p /x $rsp` to print registers. Or `layout reg`. — Peter Cordes, Mar 30 '18 at 02:39

Alexis Wilke · Answer 1 · 2018-03-30T07:35:52.390

As some people mentioned, you are overwriting the code with your stack pushes and your mov stores to [rdi]. However, the mov writes happen at rsp - 8, so they should be fine with regard to address 0x7fffffffdbd0.

I think that the problem occurs because of that. You should look at the program (`x/20i $rip`) after each push and each mov [rdi+x], ? to see what it becomes. The overwritten bytes may still decode as valid code, or they may not, in which case you get SIGILL.

=> 0x7fffffffdbb0:  movabs rax,0x2168732f6e69622f
   0x7fffffffdbba:  push   rax
   0x7fffffffdbbb:  lea    rdi,[rsp]
   0x7fffffffdbbf:  xor    rax,rax
   0x7fffffffdbc2:  mov    BYTE PTR [rdi+0x7],al
   0x7fffffffdbc5:  mov    QWORD PTR [rdi+0x8],rdi
   0x7fffffffdbc9:  mov    BYTE PTR [rdi+0x10],al
   0x7fffffffdbcc:  mov    rsi,QWORD PTR [rdi+0x8]
   0x7fffffffdbd0:  push   rax                    <-- after "push rdi" (0x7fffffffdbd0)
   0x7fffffffdbd1:  push   rdi
   0x7fffffffdbd2:  mov    rsi,rsp
   0x7fffffffdbd5:  add    rax,0x3b
   0x7fffffffdbd9:  syscall                       <-- after 2nd "push rax" (0x7fffffffdbd8)
   0x7fffffffdbdb:  add    rax,0x1
   0x7fffffffdbdf:  xor    rdi,rdi
   0x7fffffffdbe2:  syscall                       <-- after 1st "push rax" (0x7fffffffdbe0)
   0x7fffffffdbe4:  and    DWORD PTR [rcx],esp
   0x7fffffffdbe6:  and    DWORD PTR [rcx],esp    <-- mov [rdi+7] (0x7fffffffdbe7)
   0x7fffffffdbe8:  mov    al,0xdb                <-- stack starts here
   0x7fffffffdbea:  (bad)
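
The overlap can also be worked out on paper before stepping through it in gdb. The following is a rough Python model, not a reproduction of the actual process: `simulate`, `STARTS` and `END` are hypothetical names, the instruction addresses are copied from the listing, and the model ignores anything the debugger or kernel might write to the stack. The starting `rsp` is a parameter because the annotations above assume the stack starts at 0x7fffffffdbe8, while the gdb session in the question shows `rsp` at 0x7fffffffdbe8 only after the first push (that is, 0x7fffffffdbf0 before it).

```python
# Instruction start addresses, copied from the x/20i listing.
STARTS = [
    0x7FFFFFFFDBB0, 0x7FFFFFFFDBBA, 0x7FFFFFFFDBBB, 0x7FFFFFFFDBBF,
    0x7FFFFFFFDBC2, 0x7FFFFFFFDBC5, 0x7FFFFFFFDBC9, 0x7FFFFFFFDBCC,
    0x7FFFFFFFDBD0, 0x7FFFFFFFDBD1, 0x7FFFFFFFDBD2, 0x7FFFFFFFDBD5,
    0x7FFFFFFFDBD9, 0x7FFFFFFFDBDB, 0x7FFFFFFFDBDF, 0x7FFFFFFFDBE2,
]
END = 0x7FFFFFFFDBE4  # first byte after the second syscall

# (start address, length in bytes) for every instruction of the shellcode.
INSNS = [(a, b - a) for a, b in zip(STARTS, STARTS[1:] + [END])]


def simulate(rsp):
    """Replay the shellcode's own stores, starting from the given rsp.

    Returns (log, mem): log lists which instructions each store overlaps,
    mem maps every written address to the byte stored there.
    """
    mem = {}
    log = []

    def store(label, addr, value, size):
        for i, byte in enumerate(value.to_bytes(size, "little")):
            mem[addr + i] = byte
        # An instruction is hit if its byte range intersects the store.
        hit = [hex(a) for a, n in INSNS if a < addr + size and addr < a + n]
        log.append((label, hit))

    rax = 0x2168732F6E69622F           # movabs rax, imm64
    rsp -= 8
    store("push rax (1st)", rsp, rax, 8)
    rdi = rsp                          # lea rdi,[rsp]
    rax = 0                            # xor rax,rax
    store("mov BYTE PTR [rdi+0x7],al", rdi + 0x7, rax & 0xFF, 1)
    store("mov QWORD PTR [rdi+0x8],rdi", rdi + 0x8, rdi, 8)
    store("mov BYTE PTR [rdi+0x10],al", rdi + 0x10, rax & 0xFF, 1)
    rsp -= 8
    store("push rax (2nd)", rsp, rax, 8)
    rsp -= 8
    store("push rdi", rsp, rdi, 8)
    return log, mem


if __name__ == "__main__":
    # rsp before the first push, as implied by `i r rsp` in the question.
    log, mem = simulate(0x7FFFFFFFDBF0)
    for label, hit in log:
        print(f"{label:30s} overwrites {hit or 'no instruction'}")
    print("bytes at 0x7fffffffdbd9:",
          hex(mem[0x7FFFFFFFDBD9]), hex(mem[0x7FFFFFFFDBDA]))
```

With the `rsp` value taken from the question, the model has the second `push rax` landing on the final `xor rdi,rdi` / `syscall`, and `push rdi` replacing the two bytes at 0x7fffffffdbd9 with `db ff`, which no longer encode `syscall` (`0f 05`); that is the address where gdb reported SIGILL. Passing 0x7fffffffdbe8 instead reproduces the positions marked in the listing. Either way, moving `rsp` well below the code before the first push keeps the pushes away from the instructions.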

Yes, writing valid instructions explains the "delayed" failure. No, it couldn't be using stale i-cache or already-fetched instructions, not on any currently-existing x86 hardware. x86 has i-cache coherent with d-cache. Self-modifying-code in theory (on paper in the x86 manuals) required a `jmp` for instruction fetch/prefetch to "notice" newly-written instructions, but in practice (https://stackoverflow.com/q/17395557) modern CPUs always snoop for stores near the current RIP and flush the pipeline (`machine_nuke.smc`) because that's the fastest way to give *at least* the required semantics. — Peter Cordes, Mar 30 '18 at 04:11
@PeterCordes, ah. I wasn't aware that the x86 would work on keeping the I-cache coherent like this. Backward compatibility I suppose. I removed that part from my answer. — Alexis Wilke, Mar 30 '18 at 07:40
Yes, backwards compat is x86's raison d'être. See that linked answer for more discussion about the lengths vendors go to to avoid breaking existing code in widely used software like Windows, especially @krazyglew's answer and the comment thread on it. (Andy Glew worked on Intel's P6 microarchitecture.) Real CPUs very often have stronger guarantees than the paper architecture. The paper spec is what they might eventually be able to relax to at some point. — Peter Cordes, Mar 30 '18 at 17:36
