Passing an array (argv) to a syscall in assembly x64

Question

I am learning to create shellcode and having a great time. I mostly understand what to do, and I can write asm that actually spawns a shell. However, I wanted to verify my ability by trying another program, namely cat.

I am using the method of building the stack from the registers. However, I am running into an issue where I need to pass an array to the 'argv' parameter. This is simple enough when spawning a shell: I can just pass the address of the address of the /bin/sh string on the stack. But with cat I need to pass both the name of the program, /bin/cat, and the argument for cat, i.e. /etc/issue.

I know that the layout for a syscall is:

rax : syscall ID
rdi : arg0
rsi : arg1
rdx : arg2
r10 : arg3
r8 : arg4
r9 : arg5

What I can't decipher is how to pass {"cat","/etc/issue"} into a single register, namely rsi.

My assembly:

global _start
section .text
_start:
;third argument
xor rdx,rdx

;second array member
xor rbx,rbx
push rbx ;null terminator for upcoming string
;push string in 2 parts
mov rbx,6374652f ;python '/etc/issue'[::-1].encode().hex()
push rbx
xor rbx,rbx
mov rbx, 0x65757373692f
push rbx

;first array member
xor rcx,rcx ;null terminator for upcoming string
add rcx,0x746163 ;python 'cat'[::-1].encode().hex()
push rcx

;first argument
xor rdi,rdi
push rdi ;null terminator for upcoming string
add rdi,7461632f6e69622f ;python '/bin/cat'[::-1].encode().hex()
push rdi
mov rdi,rsp

;execve syscall
xor rax,rax
add rax,59

;exit call
xor rdi,rdi
xor rax,rax
add rax,60

It runs but (as expected) aborts when a NULL is passed as argv.

I even tried just writing a C app that creates an array and quits and debugged that but I still didn't really understand what it was doing to create the array.

I see you pushing things to the stack and messing with registers, but no actual `call` or `syscall` to be seen. — Joseph Sible-Reinstate Monica, May 05 '20 at 04:57
`add rdi,7461632f6e69622f` - you forgot the leading `0x` for NASM hex constants. Also, only `mov` can use a 64-bit immediate, so do that instead of adding to 0. Also you forgot the `syscall` instruction to actually call into the kernel with RAX=59. — Peter Cordes, May 05 '20 at 06:39
Also, NASM doesn't suck, unlike some assemblers, so you can just do `mov rdi, '/etc/iss'` and store / push that to get characters in that order in memory. — Peter Cordes, May 05 '20 at 06:46
I had to re-type the thing, I have those calls in my original code (on a vm) but I can't cut paste from it because....reasons. — twodox, May 05 '20 at 15:53
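
For reference, the little-endian packing discussed in the comments above can be checked with a short script. This is only a sketch: the helper name `push_immediates` is hypothetical, and it just computes the 64-bit values in the order they would be pushed. It does not address null bytes in the instruction encoding (a short final chunk such as `ue` is zero-padded, which terminates the string in memory but also puts zero bytes in the immediate).

```python
def push_immediates(text):
    """Split text into 8-byte chunks and return them as integers in push order."""
    chunks = [text[i:i + 8] for i in range(0, len(text), 8)]
    # The stack grows downward, so the last chunk must be pushed first.
    return [int.from_bytes(chunk, "little") for chunk in reversed(chunks)]


for value in push_immediates(b"/etc/issue"):
    print(hex(value))
# 0x6575
# 0x7373692f6374652f
```

Pushing the values in this order leaves `/etc/issue` contiguous in memory, starting at the final value of rsp.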

Joseph Sible-Reinstate Monica · Accepted Answer · 2020-05-05T06:08:41.573

2

You're making this way more complicated than you need to. Here's all you need to do:

    jmp .afterdata
.pathname:
    db '/bin/' ; note lack of null terminator
.argv0:
    db 'cat'
.endargv0:
    db 1 ; we'll have to change the last byte to a null manually
.argv1:
    db '/etc/issue'
.endargv1:
    db 1 ; we'll have to change the last byte to a null manually
.afterdata:
    xor eax, eax ; the null terminator for argv and envp
    push rax
    mov rdx, rsp ; rdx = envp
    dec byte [rel .endargv1] ; change our 1 byte to a null byte
    lea rax, [rel .argv1]
    push rax
    dec byte [rel .endargv0] ; change our 1 byte to a null byte
    lea rax, [rel .argv0]
    push rax
    mov rsi, rsp ; rsi = argv
    lea rdi, [rel .pathname]
    xor eax, eax
    mov al, 59 ; SYS_execve
    syscall
    ; if you wanted to do an exit in case the execve fails, you could here, but for shellcode I don't see the point

You don't need to do any hex-encoding or reversing of strings by hand. You can just put the strings you need directly in your shellcode and push their addresses onto the stack using rip-relative addressing. The only hoops we jump through are placing the data before the instructions that use it, so the rel32 offsets are small negative numbers with no null bytes in them, and adding the null terminators to the strings at runtime.
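
To see why the position of the data relative to the instructions matters, here is a rough sketch of how a rip-relative displacement is stored. The offsets `0x20` and `-0x20` are made-up values for illustration, not the real distances in the code above, and the helper name `rel32` is hypothetical.

```python
import struct


def rel32(displacement):
    """Encode a signed 32-bit displacement the way x86-64 stores it (little-endian)."""
    return struct.pack("<i", displacement)


forward = rel32(0x20)    # data placed after the instruction
backward = rel32(-0x20)  # data placed before the instruction

print(forward.hex(), 0 in forward)    # 20000000 True
print(backward.hex(), 0 in backward)  # e0ffffff False
```

A small positive offset is padded with `00` bytes, while a small negative one is padded with `ff` bytes. A negative offset whose low byte happens to be `00` would still be a problem.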

Also, you generally want shellcode to be short. Notice how I point into the cat that's part of /bin/cat instead of storing it a second time, and reuse the null pointer at the end of argv as envp.
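
The three pushes in the code above can be modelled roughly as follows. This is a toy simulation, not real memory: the `Stack` class, the starting stack address and the two string addresses are all hypothetical values chosen for illustration.

```python
class Stack:
    """Toy model of a downward-growing stack of 8-byte slots."""

    def __init__(self, top=0x7000):
        self.rsp = top
        self.mem = {}

    def push(self, value):
        self.rsp -= 8
        self.mem[self.rsp] = value


def pointer_array(stack, addr):
    """Collect 8-byte entries starting at addr until a null pointer is reached."""
    entries = []
    while stack.mem[addr] != 0:
        entries.append(stack.mem[addr])
        addr += 8
    return entries


ARGV0, ARGV1 = 0x1005, 0x1009  # hypothetical addresses of "cat" and "/etc/issue"

stack = Stack()
stack.push(0)        # null pointer: ends argv, and is all of envp
rdx = stack.rsp      # rdx = envp
stack.push(ARGV1)
stack.push(ARGV0)
rsi = stack.rsp      # rsi = argv

print([hex(p) for p in pointer_array(stack, rsi)])  # ['0x1005', '0x1009']
print(pointer_array(stack, rdx))                    # []
```

Here rdx ends up 16 bytes above rsi, so the same zero slot serves as the end of argv and as an empty envp.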

By the way, if you want to try this as a standalone program, you'll need to pass -Wl,-N and -static to GCC, since the bytes it's modifying will be in the .text section (which is normally read-only). This won't be a problem when you're actually using it as shellcode, since it'll still be writable by whatever means you got it into memory in the first place.

edited May 05 '20 at 06:08

answered May 05 '20 at 05:48

Joseph Sible-Reinstate Monica


Shellcode needs to avoid `00` bytes in the machine code. Jump forward over your strings so the `RIP + rel32` addressing mode for LEA will use `0xFFFFFF??` instead of `0x000000??`. Or see [Avoiding 0xFF bytes in shellcode using CALL to read RIP?](https://stackoverflow.com/q/55778839) for biasing the offset and undoing it. – Peter Cordes May 05 '20 at 05:50
@PeterCordes Good point. I was just thinking about how to get it to run, not how to get it into the target memory. Edit coming up. – Joseph Sible-Reinstate Monica May 05 '20 at 05:55
I was reading through your code and had a couple questions: 1. I don't see `.endargv0/1` being moved to a register or the stack. Is that because when you load the effective address of the string it reads forward to the null? 2. I don't see where you put path and argv0 together. Is this another instance of allowing the string to read the stack until it gets to the null after argv0? – twodox May 05 '20 at 16:03
Also, how does the function know that there are 2 parts to the array? Is it because it runs into a double null? – twodox May 05 '20 at 16:21
@twodox In assembly, labels don't separate things, so there's no need to put them back together. And I'm not moving any of the strings at any point; I'm just creating pointers to them. And there's no "double null"; the arrays just end when they get to **a** null. – Joseph Sible-Reinstate Monica May 05 '20 at 16:31
@JosephSible-ReinstateMonica In that case, why doesn't the array end when it finds the null terminator for the first string (namely endargv0)? Also, I copied your code over to watch in edb-debugger what it's doing, to better understand. It gets a Segmentation Fault when it reaches `dec byte [rel .endargvX]`. I put your code directly after `_start:`. I also haven't injected it yet, just ran `nasm -felf64 [file]` and `ld [file]`. I'm working from a book and they made a couple modifications to the linux virtual machine for the randomize stack layout iirc. – twodox May 05 '20 at 17:23
@twodox The segfault is because you didn't read the end of my answer: "if you want to try this as a standalone program, you'll need to pass `-Wl,-N` and `-static` to GCC", or since you're using ld directly, pass `-N` and `-static` to it. – Joseph Sible-Reinstate Monica May 05 '20 at 17:27
@twodox You're mixing up two different ends. A null byte ends a string, and a null pointer ends an array of strings. – Joseph Sible-Reinstate Monica May 05 '20 at 17:28
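
The difference between the two kinds of terminator can be illustrated with a small model of memory. This is a sketch under stated assumptions: `BASE` is a hypothetical address, and the pointer table is placed right after the strings for simplicity, whereas the answer's code builds it on the stack.

```python
import struct

BASE = 0x1000  # hypothetical address where the string data starts

# String data as it looks after both `dec byte` fix-ups have run.
STRINGS = b"/bin/cat\x00/etc/issue\x00"
TABLE_AT = BASE + len(STRINGS)
# argv: pointer to "cat", pointer to "/etc/issue", then a null pointer.
MEMORY = STRINGS + struct.pack("<3Q", BASE + 5, BASE + 9, 0)


def c_string(addr):
    """Read bytes from addr up to (not including) the next null byte."""
    start = addr - BASE
    return MEMORY[start:MEMORY.index(b"\x00", start)]


def pointer_array(addr):
    """Follow 8-byte pointers from addr until a null pointer is found."""
    strings = []
    while True:
        (ptr,) = struct.unpack_from("<Q", MEMORY, addr - BASE)
        if ptr == 0:
            return strings
        strings.append(c_string(ptr))
        addr += 8


print(c_string(BASE))           # b'/bin/cat'
print(pointer_array(TABLE_AT))  # [b'cat', b'/etc/issue']
```

The null byte after `cat` only stops the read of that one string. The walk over the array continues until it reaches an 8-byte pointer that is zero.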
