Shellcode Segmentation Fault error when run from exploitable program

Question

BITS 64
section     .text
global      _start

_start:
jmp short two

one:
pop     rbx
xor     al,al
xor     cx,cx
mov     al,8
mov     cx,0755
int     0x80
xor     al,al
inc     al
xor     bl,bl                               
int     0x80

two:
call one
db  'H'

This is my assembly code. I assembled and linked it with two commands: "nasm -f elf64 newdir.s -o newdir.o" and "ld newdir.o -o newdir". Running ./newdir worked fine, but when I extracted the opcodes and tried to test this shellcode with the following C program, it did not work (no segmentation fault). I compiled it with the command gcc newdir -z execstack

#include <stdio.h>
char sh[]="\xeb\x16\x5b\x30\xc0\x66\x31\xc9\xb0\x08\x66\xb9\xf3\x02\xcd\x80\x30\xc0\xfe\xc0\x30\xdb\xcd\x80\xe8\xe5\xff\xff\xff\x48";
void main(int argc, char **argv)
{
    int (*func)();
    func = (int (*)()) sh;
    (int)(*func)();
}

objdump -d newdir

newdir:     file format elf64-x86-64


Disassembly of section .text:

0000000000400080 <_start>:
  400080:   eb 16                   jmp    400098 <two>

0000000000400082 <one>:
  400082:   5b                      pop    %rbx
  400083:   30 c0                   xor    %al,%al
  400085:   66 31 c9                xor    %cx,%cx
  400088:   b0 08                   mov    $0x8,%al
  40008a:   66 b9 f3 02             mov    $0x2f3,%cx
  40008e:   cd 80                   int    $0x80
  400090:   30 c0                   xor    %al,%al
  400092:   fe c0                   inc    %al
  400094:   30 db                   xor    %bl,%bl
  400096:   cd 80                   int    $0x80

0000000000400098 <two>:
  400098:   e8 e5 ff ff ff          callq  400082 <one>
  40009d:   48                      rex.W

When I run ./a.out, I get something like what is shown in the photo. I am attaching a photo because I can't explain what is happening.

P.S. My problem is resolved, but I wanted to know where things were going wrong, so I used a debugger. The result is below:

(gdb) list
1   char shellcode[] = "\xeb\x16\x5b\x30\xc0\x66\x31\xc9\xb0\x08\x66\xb9\xf3\x02\xcd\x80\x30\xc0\xfe\xc0\x30\xdb\xcd\x80\xe8\xe5\xff\xff\xff\x48";
2   int main (int argc, char **argv)
3   {
4           int (*ret)();              
5           ret = (int(*)())shellcode; 
6                                      
7           (int)(*ret)();   
8           }
(gdb) disassemble main
Dump of assembler code for function main:
   0x00000000000005fa <+0>: push   %rbp
   0x00000000000005fb <+1>: mov    %rsp,%rbp
   0x00000000000005fe <+4>: sub    $0x20,%rsp
   0x0000000000000602 <+8>: mov    %edi,-0x14(%rbp)
   0x0000000000000605 <+11>:    mov    %rsi,-0x20(%rbp)
   0x0000000000000609 <+15>:    lea    0x200a20(%rip),%rax        # 0x201030 <shellcode>
   0x0000000000000610 <+22>:    mov    %rax,-0x8(%rbp)
   0x0000000000000614 <+26>:    mov    -0x8(%rbp),%rdx
   0x0000000000000618 <+30>:    mov    $0x0,%eax
   0x000000000000061d <+35>:    callq  *%rdx
   0x000000000000061f <+37>:    mov    $0x0,%eax
   0x0000000000000624 <+42>:    leaveq 
   0x0000000000000625 <+43>:    retq   
End of assembler dump.
(gdb) b 7
Breakpoint 1 at 0x614: file test.c, line 7.
(gdb) run
Starting program: /root/Desktop/Progs/shell/a.out 

Breakpoint 1, main (argc=1, argv=0x7fffffffe2b8) at test.c:7
7           (int)(*ret)();   
(gdb) info registers rip
rip            0x555555554614   0x555555554614 <main+26>
(gdb) x/5i $rip
=> 0x555555554614 <main+26>:    mov    -0x8(%rbp),%rdx
   0x555555554618 <main+30>:    mov    $0x0,%eax
   0x55555555461d <main+35>:    callq  *%rdx
   0x55555555461f <main+37>:    mov    $0x0,%eax
   0x555555554624 <main+42>:    leaveq 
(gdb) s
(Control got stuck here, so I pressed Ctrl+C)
^C
 Program received signal SIGINT, Interrupt.
 0x0000555555755048 in shellcode ()
(gdb) x/5i 0x0000555555755048 
=> 0x555555755048 <shellcode+24>:   callq  0x555555755032 <shellcode+2>
   0x55555575504d <shellcode+29>:   rex.W add %al,(%rax)
   0x555555755050:  add    %al,(%rax)
   0x555555755052:  add    %al,(%rax)
   0x555555755054:  add    %al,(%rax)

Here is the debugging information. I am not able to find where control goes wrong. If you need more info, please ask.

You should be xoring the entire 32-bit registers, not just the lower parts, as the full 32-bit value is generally used by the interrupt routines. Example: `xor al,al` should be `xor eax,eax`. Do the same for all of them. — Michael Petch, Jan 30 '18 at 20:18
You have to use `syscall` instead of `int 0x80`, as you'll need 64-bit addresses for the stack-based pointers and that isn't supported by `int 0x80`. If you were making a 32-bit application with `-felf32` and linking with `-melf_i386` to generate a 32-bit executable, it should work on OSes that support running 32-bit executables (Windows Subsystem for Linux doesn't, as an example) — Michael Petch, Jan 30 '18 at 20:34
the reason it runs standalone is that the code and data is not being run from the stack, so no stack based pointers and luckily the upper parts of the registers you are using are already zero. When it is run from the stack in any arbitrary program as a shell code your data is now on the stack and the upper parts of the registers you are using may no longer be zero. I suspect the bulk of your problems are related to using 64-bit stack based address via RBX where `int 0x80` only sees the bottom 32-bits in EBX. — Michael Petch, Jan 30 '18 at 20:47
You don't actually have a picture attached, but I can assume it tells us you got a segmentation fault when run. — Michael Petch, Jan 30 '18 at 20:56
PS: You can't just replace `int 0x80` with `syscall`. It uses a different calling convention. [Ryan Chapman's blog](http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/) has good information on the use of `syscall` in Linux. — Michael Petch, Jan 31 '18 at 02:00
So if we want to run shellcode on 32-bit system then the assembly should be written in 32 bit format. Am I right? @MichaelPetch — Pushpam Kumar, Jan 31 '18 at 19:41
Correct. To make this 32-bit you'd have to use `[bits 32]` change `pop rbx` to `pop ebx`. Assemble with NASM using `-felf32` instead of `-felf64` and link by adding `-melf_i386` to the LD command line. When compiling the _C_ code you'd have to use `-m32` option. Plus you'll want to make the changes regarding zeroing the 32-bit registers as I mentioned previously. — Michael Petch, Jan 31 '18 at 19:45
As it is your code is closer to 32-bit code than it is to 64-bit code. I can only guess you may have used a 32-bit shellcode tutorial and attempted to use it as 64-bit code. — Michael Petch, Jan 31 '18 at 19:46
Use `si` (`stepi`) to step by asm instructions, not by C source lines, so you can follow the apparently infinite loop your code gets stuck in. — Peter Cordes, Feb 01 '18 at 11:46
If you have a new question please start a new question, don't amend an existing one. — Michael Petch, Feb 01 '18 at 12:48

score 2 · Accepted Answer · answered Jan 31 '18 at 20:34

Below is a working example using x86-64, which could be further optimized for size. The trailing 0x00 byte is OK for the purpose of executing the shellcode.

assemble & link:

$ nasm -felf64 -g -F dwarf pushpam_001.s -o pushpam_001.o && ld pushpam_001.o -o pushpam_001

Code:

BITS 64
section     .text
global      _start

_start:
jmp short two

one:
        pop   rdi               ; pathname

        xor rax, rax
        add al, 85              ; creat syscall 64-bit Linux

        xor rsi, rsi
        add si, 0755            ; mode: NASM reads 0755 as decimal 755 (0x2f3); write 0o755 for octal
        syscall

        xor rax, rax
        add ax, 60              ; exit syscall 64-bit Linux
        xor rdi, rdi            ; exit status 0
        syscall

two:
call one
db  'H',0

objdump:

pushpam_001:     file format elf64-x86-64
0000000000400080 <_start>:
  400080:       eb 1c                   jmp    40009e <two>
0000000000400082 <one>:
  400082:       5f                      pop    rdi
  400083:       48 31 c0                xor    rax,rax
  400086:       04 55                   add    al,0x55
  400088:       48 31 f6                xor    rsi,rsi
  40008b:       66 81 c6 f3 02          add    si,0x2f3
  400090:       0f 05                   syscall
  400092:       48 31 c0                xor    rax,rax
  400095:       66 83 c0 3c             add    ax,0x3c
  400099:       48 31 ff                xor    rdi,rdi
  40009c:       0f 05                   syscall
000000000040009e <two>:
  40009e:       e8 df ff ff ff 48 00               

             .....H.
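The displacements in the dump above can be checked by hand: a relative `jmp` or `call` targets the address of the next instruction plus the signed displacement stored in the instruction. The following is a minimal sketch of that arithmetic; the helper names `branch_target` and `rel32` are made up for illustration and are not part of the answer.

```c
#include <stdint.h>

/* Target of a relative branch: address of the NEXT instruction
   (insn_addr + insn_len) plus the signed displacement. */
static uint64_t branch_target(uint64_t insn_addr, unsigned insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}

/* Decode the little-endian rel32 that follows the e8 (call) opcode. */
static int32_t rel32(const uint8_t b[4])
{
    uint32_t u = (uint32_t)b[0] | (uint32_t)b[1] << 8 |
                 (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
    return (int32_t)u;
}
```

For `e8 df ff ff ff` at 0x40009e the displacement is -33, so the call lands on 0x400082 (`one`) and pushes 0x4000a3, the address of the 'H' string, which `pop rdi` then picks up. For `eb 1c` at 0x400080 the target is 0x40009e (`two`).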

encoding extraction: There are many other ways to do this.

$ for i in `objdump  -d pushpam_001  | grep "^ " | cut -f2`; do echo -n '\x'$i; done; echo
\xeb\x1c\x5f\x48\x31\xc0\x04\x55\x48\x31\xf6\x66\x81\xc6\xf3\x02\x0f\x05\x48\x31\xc0\x66\x83\xc0\x3c\x48\x31\xff\x0f\x05\xe8\xdf\xff\xff\xff\x48\x00\x.....H.
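Note that the last token of that output (`\x.....H.`) is not a byte and has to be dropped by hand, as the C array below does. As one of the "other ways", here is a small sketch that formats a byte buffer as `\xNN` escapes directly; the function name `bytes_to_escapes` is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Write each byte of code[] as "\xNN" into out.
   out must hold 4*len + 1 characters.
   Returns the number of characters written, or -1 if out is too small. */
static int bytes_to_escapes(const unsigned char *code, size_t len,
                            char *out, size_t out_size)
{
    if (out_size < 4 * len + 1)
        return -1;
    for (size_t i = 0; i < len; i++)
        snprintf(out + 4 * i, 5, "\\x%02x", code[i]);   /* 4 chars + NUL */
    out[4 * len] = '\0';
    return (int)(4 * len);
}
```

Feeding it the bytes read from the `.text` section of the linked binary would give the same string without the stray trailing token.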

C shellcode.c - partial

...
unsigned char code[] = \
"\xeb\x1c\x5f\x48\x31\xc0\x04\x55\x48\x31\xf6\x66\x81\xc6\xf3\x02\x0f\x05\x48\x31\xc0\x66\x83\xc0\x3c\x48\x31\xff\x0f\x05\xe8\xdf\xff\xff\xff\x48\x00";  
...

final:

./shellcode 

--wxrw---t 1 david david     0 Jan 31 12:25 H
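The odd-looking permissions follow from the constant in the assembly: NASM does not treat a leading zero as octal, so `0755` is decimal 755 (0x2f3, visible as `f3 02` in the objdump), which is octal 01363. The sketch below renders mode bits the way `ls -l` does for a regular file; the helper name `mode_string` is made up, and the umask of 002 is an assumption inferred from the output, not something stated in the answer.

```c
/* Render the low 12 mode bits like `ls -l` does for a regular file.
   out must hold 11 characters (10 plus the terminating NUL). */
static void mode_string(unsigned mode, char out[11])
{
    const char rwx[] = "rwxrwxrwx";
    out[0] = '-';                                   /* regular file */
    for (int i = 0; i < 9; i++)
        out[i + 1] = (mode & (0400u >> i)) ? rwx[i] : '-';
    if (mode & 04000) out[3] = (mode & 0100) ? 's' : 'S';   /* setuid */
    if (mode & 02000) out[6] = (mode & 0010) ? 's' : 'S';   /* setgid */
    if (mode & 01000) out[9] = (mode & 0001) ? 't' : 'T';   /* sticky */
    out[10] = '\0';
}
```

With the assumed umask, `mode_string(755 & ~002u, buf)` yields `--wxrw---t`, matching the listing above, while the intended octal value `0755` yields `-rwxr-xr-x`. Writing the constant as `0o755` in NASM gives the latter.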

With 64-bit code you can avoid a full jmp/call/pop method if your data is 127 or fewer bytes. You can start with a JMP that skips over the data (you have to grammatically 0 the end of the string), followed by the data itself, followed by the code. You can then use RIP addressing and LEA to get the address of the data item called `two`. You can also save a few bytes by replacing `xor` on 64-bit registers with their 32-bit counterparts. When the destination of an operation is a 32-bit register, the processor automatically zero-extends it into the upper 32 bits of the associated 64-bit register. — Michael Petch, Jan 31 '18 at 21:01
Correct. I assume the OP is going through some of the same kind of material that I went through that covers basic jmp-call-pop, REL, and stack methods including null reduction but less emphasis on optimizing size; which requires extra study of opcodes, ELF, etc. that many modern books and classes don't provide. — InfinitelyManic, Jan 31 '18 at 21:14
My auto correct put `grammatically` where it should have said `programatically` lol — Michael Petch, Jan 31 '18 at 21:15

score 2 · Answer 2 · answered Jan 31 '18 at 23:27

If int 0x80 in 64-bit code was the only problem, building your C test with gcc -fno-pie -no-pie would have worked, because then char sh[] would be in the low 32 bits of virtual address space, so system calls that truncate pointers to 32 bits would still work.
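To make the truncation concrete, here is a minimal sketch of what the 32-bit ABI sees of a pointer passed in RBX. The function names are made up for illustration. The first address is the return address pushed by the `call` in the question's gdb session (`shellcode+29`); the low address is a hypothetical non-PIE location, not taken from the question.

```c
#include <stdint.h>

/* int 0x80 takes its pointer argument from EBX, so only the low
   32 bits of RBX reach the kernel. */
static uint64_t pointer_seen_by_int80(uint64_t rbx)
{
    return rbx & 0xffffffffu;
}

/* Nonzero if the address still points at the same place after truncation. */
static int survives_truncation(uint64_t addr)
{
    return pointer_seen_by_int80(addr) == addr;
}
```

For the PIE build, 0x55555575504d becomes 0x5575504d, which no longer points at the 'H' string. An address such as 0x60105d (hypothetical, typical of a non-PIE data section) passes through unchanged.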

Run your program under strace to see what system calls it actually makes. (Except that strace decodes int 0x80 syscalls incorrectly in 64-bit code, decoding as if you'd used the 64-bit syscall ABI. The call numbers and arg registers are different.) But at least you can see the system-call return values (which will be -EFAULT for 32-bit creat with a truncated 64-bit pointer.)

You can also just use gdb to single-step and check the system call return values. Having strace decode the system-call inputs is really nice, though, so I'd recommend porting your code to use the 64-bit ABI, and then it would just work.

Also, it would actually be able to exploit 64-bit processes where the buffer overflow is in memory at an address outside the low 32 bits (e.g. the stack). So yes, you should really stop using int 0x80 or stick to 32-bit code.

You're also depending on registers being zeroed before your code runs, like they are on process startup, but not when called from anywhere else.

xor al,al before mov al,8 is completely pointless, because xor-zeroing al doesn't clear upper bytes. Writing 32-bit registers clears the upper 32, but not writing 8 or 16 bit registers. And if it did, you wouldn't need the xor-zeroing before using mov which is also write-only.
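The partial-register rule can be modelled in a few lines. This is only a sketch with made-up helper names; it assumes, as this answer predicts, that the first `int 0x80` returned -EFAULT (-14) in RAX.

```c
#include <stdint.h>

/* Writing an 8-bit register (al) leaves bits 8..63 of rax untouched. */
static uint64_t write_al(uint64_t rax, uint8_t v)
{
    return (rax & ~(uint64_t)0xff) | v;
}

/* Writing a 16-bit register (ax) leaves bits 16..63 untouched. */
static uint64_t write_ax(uint64_t rax, uint16_t v)
{
    return (rax & ~(uint64_t)0xffff) | v;
}

/* Writing a 32-bit register (eax) zero-extends into the full rax. */
static uint64_t write_eax(uint64_t rax, uint32_t v)
{
    (void)rax;              /* old contents are discarded */
    return v;
}
```

Starting from RAX = -14, `xor al,al` followed by `inc al` leaves EAX = 0xffffff01 rather than 1, so the second `int 0x80` is not an exit call. Execution would then fall through into `call one` again, which would be consistent with the loop seen in the question's gdb session. With `xor eax,eax` first, the same sequence gives exactly 1.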

If you want to set RAX=8 without any zero bytes in the machine code, you can

push 8 / pop rax (3 bytes)
xor eax,eax / mov al,8 (4 bytes)
Or given a zeroed rcx register, lea eax, [rcx+8] (3 bytes)
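A quick way to compare these options is to scan their encodings for zero bytes. The sketch below does that; the helper name `has_zero_byte` is made up, and the byte sequences are the standard encodings of the listed instructions, with plain `mov eax,8` added only as a counterexample.

```c
#include <stddef.h>

static const unsigned char push_pop[]  = { 0x6a, 0x08, 0x58 };             /* push 8 / pop rax       */
static const unsigned char xor_mov[]   = { 0x31, 0xc0, 0xb0, 0x08 };       /* xor eax,eax / mov al,8 */
static const unsigned char lea_rcx[]   = { 0x8d, 0x41, 0x08 };             /* lea eax,[rcx+8]        */
static const unsigned char mov_imm32[] = { 0xb8, 0x08, 0x00, 0x00, 0x00 }; /* mov eax,8              */

/* Nonzero if the machine code contains a 0x00 byte, which would
   terminate a C string copy of the shellcode early. */
static int has_zero_byte(const unsigned char *code, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (code[i] == 0)
            return 1;
    return 0;
}
```

Only the `mov eax,8` form contains zero bytes, because its immediate is a full 32 bits wide.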

Setting CX to 0755 isn't so simple, because the constant doesn't fit in an imm8. Your 16-bit mov is a good choice (or would have been if you'd zeroed rcx first).

xor  ecx,ecx
lea  eax, [rcx+8]   ; SYS_creat = 8 from unistd_32.h
mov  cx, 0755       ; mode (NASM parses 0755 as decimal 755; write 0o755 for octal)
int  0x80           ; invoke 32-bit ABI

xor  ebx,ebx
lea  eax, [rbx+1]   ; SYS_exit = 1
int  0x80

Presuming that stack alignment (re: push 8 / pop rax (3 bytes)) is moot in the context of some small shellcode that isn't detrimentally relying on stack boundaries? — InfinitelyManic, Feb 01 '18 at 06:33
@InfinitelyManic: what? `push 8` / `pop rax` works even if RSP is odd, and it's an exploit so presumably we don't care about trashing the red zone. It can only fault if RSP isn't pointing to writeable memory, but that's highly unlikely if we reached our exploit via a `ret` or other typical attack vectors that overwrite return addresses. — Peter Cordes, Feb 01 '18 at 07:24
Thank you, I have added one more question to the original question in the P.S. section. — Pushpam Kumar, Feb 01 '18 at 11:15
