
Going through http://hackoftheday.securitytube.net/2013/04/demystifying-execve-shellcode-stack.html

I understood the nasm program which invokes execve and was trying to re-write it.

Some background information:

int execve(const char *filename, char *const argv[], char *const envp[]);

So, eax = 11 (the system call number for execve), ebx should point to char *filename, ecx should point to argv[] (which will be the same as ebx, since the first argument is the filename itself, e.g. "/bin/sh" in this case), and edx should point to envp[] (null in this case).

Original nasm code:

global _start

section .text
_start:

xor eax, eax
push eax

; PUSH //bin/sh in reverse i.e. hs/nib//

push 0x68732f6e
push 0x69622f2f

mov ebx, esp

push eax
mov edx, esp

push ebx
mov ecx, esp

mov al, 11
int 0x80

The stack is as follows:

[Image: stack diagram for the original code]

I then tried to optimize this by removing a few instructions. I agree that up to mov ebx, esp the code stays the same. However, since ecx needs to point to ebx, I rewrote the code as follows:

global _start

section .text
_start:

xor eax, eax
push eax

; PUSH //bin/sh in reverse i.e. hs/nib//

push 0x68732f6e
push 0x69622f2f
mov ebx, esp

mov ecx,ebx

push eax
mov edx, esp

mov al, 11
int 0x80

However, I get a segmentation fault when I run my re-written code.

My stack is as follows: [Image: stack diagram for the rewritten code]

Any ideas why the rewritten code does not work? I've also run gdb and the address values match what I expect, but it just won't run.

didierc
user720694
  • Notice: Those assembly codes are strongly machine dependent and there is no guarantee to execute on every random machine. – masoud May 19 '13 at 08:30
  • @M.M But the original code works just fine. I've tested both of them on the same machine. – user720694 May 19 '13 at 08:32

1 Answer


In both cases ebx points to the string "//bin/sh", which is the equivalent of C code like this:

char *EBX = "//bin/sh";  

But in your first example, ecx is set to the address of a pointer to that string, which is the equivalent of C code like this:

char *temp = "//bin/sh"; // push ebx
char **ECX = &temp;      // mov ecx, esp

While in your second example, ecx is just set to the same value as ebx.

char *ECX = "//bin/sh";

The two examples are thus fundamentally different, with ecx having two completely different types and values.
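To make the difference concrete, here is a minimal C sketch of what is read when ecx is treated as a `char **` but actually holds the address of the string. It assumes a 32-bit little-endian target, as in the question; the helper names `read_slot_le32` and `bogus_argv0` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read the 32-bit little-endian value stored in
 * slot `index` of the memory starting at `base`, the way argv[index]
 * would be fetched through the address held in ecx on 32-bit x86. */
uint32_t read_slot_le32(const unsigned char *base, unsigned index)
{
    const unsigned char *p;

    assert(base != NULL);
    p = base + 4u * index;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* In the second example ecx == ebx, so "argv[0]" is really the first
 * four bytes of the string itself, not a pointer to it. */
uint32_t bogus_argv0(void)
{
    static const unsigned char string_on_stack[] = "//bin/sh";

    return read_slot_le32(string_on_stack, 0);
}
```

Here `bogus_argv0()` comes out as 0x69622f2f, i.e. the characters "//bi" read as if they were an address. That value is not a pointer to an argument string, which is presumably why the call cannot succeed.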

Update:

I should add that technically ecx points to an array of char pointers (the argv argument), not just to a single char pointer. You're actually building a two-item array on the stack.

char *argv[2];
argv[1] = NULL;         // push eax, eax being zero
argv[0] = "//bin/sh";   // push ebx
ECX = argv;             // mov ecx,esp

It's just that half of that array is doubling as the envp argument too. Since envp is a single-item array with that single item set to NULL, you can think of the envp argument as being set with C code like this:

EDX = envp = &argv[1];           

This is achieved by setting edx to esp while the argv array is only half constructed. Combining the code for the two assignments together you get this:

char *argv[2];
argv[1] = NULL;         // push eax, eax being zero
EDX = &argv[1];         // mov edx,esp
argv[0] = "//bin/sh";   // push ebx
ECX = argv;             // mov ecx,esp

It's a bit convoluted, but I hope that makes sense to you.
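If it helps, the same sharing trick can be written as compilable C. This is only a sketch: the struct `execve_args` and the function `build_shared_vectors` are names I've invented here, and the caller supplies the two-slot array that plays the role of the stack memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the three values that end up in
 * ebx, ecx and edx before `int 0x80`. */
struct execve_args {
    char  *filename;  /* ebx */
    char **argv;      /* ecx */
    char **envp;      /* edx */
};

/* Fill a caller-supplied two-slot array so that argv and envp share
 * the same NULL entry, in the same order as the assembly does it. */
struct execve_args build_shared_vectors(char *filename, char *slots[2])
{
    struct execve_args a;

    assert(filename != NULL && slots != NULL);

    a.filename = filename;
    slots[1] = NULL;        /* push eax     */
    a.envp = &slots[1];     /* mov edx, esp */
    slots[0] = filename;    /* push ebx     */
    a.argv = slots;         /* mov ecx, esp */
    return a;
}
```

Passing `a.filename`, `a.argv` and `a.envp` to `execve` from `<unistd.h>` would then correspond to the register setup in the original code: `a.envp` is simply the address of `a.argv[1]`.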

Update 2

All of the arguments to execve are passed in registers, but those registers hold pointers to memory which needs to be allocated somewhere - in this case, on the stack. Since the stack grows downwards in memory, the chunks of memory need to be constructed in reverse order.

The memory for the three arguments looks like this:

char *filename:  2f 2f 62 69 | 6e 2f 73 68 | 00 00 00 00 
char *argv[]:    filename    | 00 00 00 00               
char *envp[]:    00 00 00 00   

The filename is constructed like this:

push eax        ; '\0' terminator plus some extra
push 0x68732f6e ; 'h','s','/','n'
push 0x69622f2f ; 'i','b','/','/'

The argv argument like this:

push eax ; NULL pointer
push ebx ; filename

And the envp argument like this:

push eax ; NULL pointer

But as I said, the original example decided to share memory between argv and envp, so there is no need for that last push eax.

I should also note that the reverse order of the characters in the two dwords used when constructing the string is because of the endianness of the machine, not the stack direction.
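The two effects (byte order inside each dword, and push order of the dwords) can be separated in a small C sketch. The function name `push_immediates` is hypothetical, and the byte packing assumes a little-endian target such as x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split `s` (whose length must be a multiple of 4)
 * into the 32-bit immediates to push, in push order (last chunk first).
 * Returns the number of dwords written to `out`, or 0 on bad input. */
size_t push_immediates(const char *s, uint32_t *out, size_t out_cap)
{
    size_t len, n, i;

    assert(s != NULL && out != NULL);

    len = strlen(s);
    n = len / 4;
    if (len % 4 != 0 || n > out_cap)
        return 0;

    for (i = 0; i < n; i++) {
        /* Stack direction: the last chunk of the string is pushed first. */
        const unsigned char *p = (const unsigned char *)s + 4 * (n - 1 - i);

        /* Endianness: the first character lands in the lowest byte. */
        out[i] = (uint32_t)p[0]
               | ((uint32_t)p[1] << 8)
               | ((uint32_t)p[2] << 16)
               | ((uint32_t)p[3] << 24);
    }
    return n;
}
```

For "//bin/sh" this gives 0x68732f6e followed by 0x69622f2f, matching the two push instructions. A 7-character "/bin/sh" is rejected by this sketch because it does not fill whole dwords, which is likely why the example uses the 8-character "//bin/sh".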

James Holderness
  • Why is ECX a double pointer (in the line mov ecx,esp)? I've also observed one thing: even if I push the same value again, I get a segmentation fault. Only the original code, in which first ebx, then edx and then ecx is pushed, works. I tried this: mov ebx,esp; push ebx; mov ecx,esp; push eax; mov edx,esp; This does not work either. This led me to think that the order in which values are pushed on the stack somehow matters. – user720694 May 19 '13 at 11:04
  • Your answer looks promising. May I also ask why //bin/sh is pushed in reverse order? If it really should have been reversed, the initial null byte should be pushed after pushing the reverse of //bin/sh. – user720694 May 19 '13 at 11:10
  • Yeah, I'm trying to understand your answer. However, I see that there are two NULLs pushed in the first code. One is before the //bin/sh (obtained via xor eax,eax) and one is after it (push eax). So ebx = //bin/sh + the first null. The second null is set to edx. – user720694 May 19 '13 at 11:50
  • I hoped that would be clear from my second update. The first NULL is just serving as the null terminator for the string. The second NULL serves both as a NULL pointer in the envp array and as a NULL pointer for the second item in the argv array. Note that edx isn't set to NULL - it's pointing to a piece of memory that contains a NULL. – James Holderness May 19 '13 at 11:55
  • Since argv = filename and filename already ends with a null, why do we need an additional null in argv? – user720694 May 19 '13 at 11:57
  • argv is not equal to the filename. It's pointing to a piece of memory that contains a pointer to the filename (i.e. filename is the first item in the argv array). The second item in the argv array needs to be set to NULL to mark the end of the array. argv is not a string - it's an array of strings. – James Holderness May 19 '13 at 12:00
  • So I think I've understood most of it. I'll accept your answer. I just have a small doubt, so bear with me. Since we require a double pointer for char* argv[] (ECX), we've provided it with mov ebx,esp. I get that. Why do we need to null-terminate the array? My understanding is that when the interrupt handler dereferences ecx, it will get the address of EBX, which points to the string //bin/sh0x00. That's it. Why do we need a null pointer again to mark the termination of the array? (The double pointer itself says that the array starts from the location it points to.) – user720694 May 19 '13 at 14:28
  • Following the above doubt, since envp[] is an array of pointers, we need to provide edx with a pointer, in a similar manner to ecx, that points to a null value in memory. So, in C, this would be something like char** EDX = (some address which points to a null value). But here, we are directly assigning char** EDX = null i.e. char* envp = NULL. So if I understand this correctly, no matter how many pointers are nested, if the array of them is initially pointing to null, that means there are no elements in it? – user720694 May 19 '13 at 14:38
  • You've almost got it. You need to NULL terminate the argv array, because it could contain multiple values. When the interrupt dereferences ecx and finds the string, that's just the first item in the array, `argv[0]`. It's then going to look at the next dword in memory (at ecx+4) to get `argv[1]` (there might be another parameter there). When it finds a NULL it knows there is nothing more to come. If there wasn't a NULL and it was some random value, it would try and interpret that as a string pointer and likely segfault. – James Holderness May 19 '13 at 14:52
  • As for the *envp* argument - that works the same way, only the first item in the array is a NULL. You don't set edx to NULL - you set it to esp, which is pointing to an area of memory that contains a NULL (i.e. `envp[0] = NULL`). You don't need to set any more of the items in the array, because once the interrupt encounters a NULL in the first item, it knows there is nothing more to come. – James Holderness May 19 '13 at 14:56
  • Thanks a lot James for being patient with me and helping me throughout. I've accepted your answer. I drew the entire stack multiple times and then it was crystal clear to me. I was wondering, if we write NULL as 0x00 0x00 0x00 0x00 (since a word is pushed to the stack), how do we point something to memory address 0? – user720694 May 19 '13 at 15:22
  • C basically guarantees that no objects are allocated at memory address 0, so you would never need to point to something at that address. If you have a pointer set to 0, by definition that is NULL. – James Holderness May 19 '13 at 16:13
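
The NULL-termination rule discussed in the comments above can be sketched in C as a loop that walks a pointer array until it meets a NULL entry. This is only an illustration of the convention, not the kernel's actual code; `count_entries` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the entries of a NULL-terminated pointer
 * array such as argv or envp. Without the terminating NULL the loop
 * would run past the end of the array. */
size_t count_entries(char *const vec[])
{
    size_t n = 0;

    assert(vec != NULL);
    while (vec[n] != NULL)
        n++;
    return n;
}
```

With the two-slot array from the answer, `count_entries(argv)` is 1 and `count_entries(&argv[1])` (the shared envp) is 0: the array pointer itself is never NULL, only its first entry is.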