
I'm not sure whether this is a bug in the emulator, but I'm going to assume it's more likely that I, the assembly newbie, am mistaken rather than the programming gurus who designed the Unicorn engine (which is based on QEMU) :P

This is Python code, but my question is really about x86-64 / assembly language, not Python:

```python
# In this code, I simply assemble some assembly code, disassemble
# it for a sanity check, and then execute the assembled binary,
# observing the resulting stack.

# Keystone
# The Ultimate Assembler
from keystone import *

# Capstone
# The Ultimate Disassembler
from capstone import *

# Unicorn
# The ultimate CPU emulator
import unicorn
from unicorn import *
from unicorn.x86_const import *

# globals

KB = 1024
MB = KB * KB
MEM_SIZE = MB // 256
STACK_SIZE = MEM_SIZE // 2
REGS_CACHE = False

# code

ks = Ks(KS_ARCH_X86, KS_MODE_64)
ks.syntax = KS_OPT_SYNTAX_ATT
md = Cs(CS_ARCH_X86, CS_MODE_64)

ASM = b''
ASM += b'PUSH $0x21;'
ASM += b'PUSH $0x22;'
ASM += b'PUSH $0x23;'
ASM += b'PUSH $0xffffffff;'
ASM += b'PUSH $0x1d1d1d1d1d1d1d1d;'
ASM += b'PUSH $0x21;'
ASM += b'MOV $1, %rax;'
ASM += b'MOV $1, %rdi;'
ASM += b'MOV %rsp, %rsi;'
ASM += b'MOV $1, %rdx;'
ASM += b'syscall;'
ASM += b'ADD $8, %rsp;'

try:
    BIN, count = ks.asm(ASM)
except KsError as e:
    print("ERROR: %s" % e)

START_ADDR = 0x0
BIN = bytes(BIN)
BIN_LEN = len(BIN)
END_ADDR = START_ADDR + BIN_LEN

print('Sanity check disassembly:')
for i in md.disasm(BIN, START_ADDR):
    print(f'0x{i.address}\t {i.mnemonic}\t {i.op_str}')

mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(START_ADDR, MEM_SIZE)
mu.mem_write(START_ADDR, BIN)
# initialize stack
mu.reg_write(UC_X86_REG_RSP, MEM_SIZE - STACK_SIZE)
mu.emu_start(START_ADDR, END_ADDR)
print("Emulation done. Below is the stack:")
memory = mu.mem_read(START_ADDR, MEM_SIZE)
rsp = mu.reg_read(UC_X86_REG_RSP)
stack = memory[rsp:MEM_SIZE]
print('rsp', rsp)
print('mem len:', len(memory))
print(bytes(stack).hex())
```

Output:

```
Sanity check disassembly:
0x0      push    0x21
0x2      push    0x22
0x4      push    0x23
0x6      push    0xffff
0x10     push    0x1d1d
0x14     push    0x21
0x16     movabs  rax, 1
0x26     movabs  rdi, 1
0x36     mov     rsi, rsp
0x39     movabs  rdx, 1
0x49     syscall
0x51     add     rsp, 8
Emulation done. Below is the stack:
rsp 2020
mem len: 4096
1d1dffff230000000000000022000000000000002100000000000000000000000000000000000000000000000...snip...
```
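The address column in that listing can be cross-checked against instruction lengths. A small sketch, assuming the listed numbers are byte offsets read as decimal (the values are copied from the listing by hand; nothing here calls Capstone), that prints them as real hexadecimal:

```python
# Offsets copied from the listing, read as plain decimal integers.
offsets = [0, 2, 4, 6, 10, 14, 16, 26, 36, 39, 49, 51]

# The length of every instruction except the last is the gap to the next offset.
lengths = [nxt - cur for cur, nxt in zip(offsets, offsets[1:])]

# Format each offset the way a "0x" prefix implies: base 16.
hex_offsets = [f"0x{off:x}" for off in offsets]

print(lengths)
print(hex_offsets)
```

The 4-byte gaps at offsets 6 and 10 are consistent with a 16-bit `push` (operand-size prefix, opcode, two immediate bytes). In the script, the equivalent formatting would be `{i.address:x}` inside the f-string.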

Before I modified these instructions, this was a simple assembly example that pushed 0x21 (`!`) onto the stack and made a system call to trigger output (it logs a `!`). I then threw some extra pushes in there to play with the stack, expecting each push to add 64 bits to the stack (decrementing RSP by 8 bytes). Instead, when I pushed the 4-byte and 8-byte values 0xffffffff and 0x1d1d1d1d1d1d1d1d, only 0xffff and 0x1d1d were pushed, and RSP was decremented by just 2 bytes each time. Pushing 0x21 again pushed a full 8 bytes. That last 0x21 isn't actually visible in this log output, but I think that's because the following instructions, maybe the syscall, popped it from the stack. If those instructions are removed, it is there.
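The RSP value in the output can be reproduced with a rough model of the stack. This is only a sketch: `simulate_pushes` is a hypothetical helper (not part of Unicorn), and the operand sizes (8, 8, 8, 2, 2, 8 bytes) are assumed from the disassembly listing rather than from what I originally wrote.

```python
def simulate_pushes(pushes, initial_rsp, mem_size, final_adjust=0):
    """Model a descending stack: each push lowers rsp by its operand size
    and stores the value little-endian at the new rsp."""
    memory = bytearray(mem_size)
    rsp = initial_rsp
    for value, size in pushes:
        rsp -= size
        memory[rsp:rsp + size] = value.to_bytes(size, "little")
    # final_adjust models the trailing `ADD $8, %rsp`.
    return rsp + final_adjust, memory

# (value, operand size in bytes), taken from the disassembly listing
PUSHES = [(0x21, 8), (0x22, 8), (0x23, 8), (0xFFFF, 2), (0x1D1D, 2), (0x21, 8)]

# The script sets RSP to MEM_SIZE - STACK_SIZE = 4096 - 2048 = 2048.
rsp, memory = simulate_pushes(PUSHES, initial_rsp=2048, mem_size=4096, final_adjust=8)
print("rsp", rsp)
print(memory[rsp:2048].hex())
```

With those sizes the model ends at RSP 2020 and the bytes from RSP upward begin `1d1dffff23...`, the same as the emulator's dump.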

I'm confused about what's going on.

J.Todd
  • Your assembler is broken because it does not complain about invalid instructions. Instead it just emits 16 bit versions. There is no `push imm64` instruction so you can't do `PUSH $0x1d1d1d1d1d1d1d1d`. Depending on what you want, you could do `push $-1` to get `0xffffffffffffffff` on the stack. – Jester Jan 14 '22 at 02:11
  • @Jester I haven't been able to find by Googling what the "imm" in "imm64" means. I assume it basically means a 64-bit number. So... I can't push 64 bits to the stack? I read in [an answer](https://stackoverflow.com/a/40312390/9400421): "The size of the value pushed on the stack and the amount that the stack pointer is adjusted by depends on the operand size of the PUSH instruction. In 64-bit mode the operand size can only be 16-bit or 64-bit." I read that to mean you can push 64-bit values or 16-bit values, which would make me think maybe my mistake was the 32-bit values. – J.Todd Jan 14 '22 at 02:18
  • There is no form of `push` that takes a 64 bit number as an immediate. You can push 64 bits, but only as sign extended or from a register or memory. – Jester Jan 14 '22 at 02:19
  • @Jester Ah imm == "immediate". And ok, I see. Thanks. I'll post that as an assembler bug for Keystone. – J.Todd Jan 14 '22 at 02:21
  • Related: [How many bytes does the push instruction push onto the stack when I don't specify the operand size?](https://stackoverflow.com/q/45127993) – Peter Cordes Jan 14 '22 at 02:25
  • `syscall` (and the Linux kernel code that's invoked by it) doesn't touch the user-space stack pointer or stack memory. The `0x21` is obviously below the stack pointer because of the `add rsp, 8` that effectively "popped" it into nowhere. (Although on Linux, it's still guaranteed not to have been overwritten by anything, along with the rest of the red zone, 128 bytes below RSP). This should have been obvious if you single-stepped in a debugger with `display /x *(unsigned long*)$rsp` or similar to dump the qword at the top of the stack. – Peter Cordes Jan 14 '22 at 02:29
  • @PeterCordes oh yeah, silly me, idk what I was thinking. – J.Todd Jan 14 '22 at 02:31
  • Easy to miss stuff when just looking at the code, even if it's obvious in hindsight. That's why I suggest in future actually doing more interactive playing around to see your code running as you single step and watch registers (and memory) change. Asm is a *very* debuggable language, in terms of debuggers being extremely useful since the program state and steps truly exist directly in memory, and the programming model is about laying out sequences of steps for the machine to take, not higher level semantics. The machine code for a single insn tells you exactly what it does in isolation. – Peter Cordes Jan 14 '22 at 02:36
  • @PeterCordes Yeah to be honest with you, the cyber sec classes I've taken that showed us debuggers for reverse engineering didn't explain much about the commands available, and so I've shied away from them. But now I suppose it's time to learn. Also that answer of yours that you linked as related was extremely informative. You could probably close this as a duplicate of that, it seems close to me. Edit: Nvm I was able to do it. – J.Todd Jan 14 '22 at 02:41
  • @PeterCordes Are you aware of any use-case for offsetting the stack, as you noted was possible, with a 16 bit push in 32 or 64 bit mode? – J.Todd Jan 14 '22 at 02:46
  • Generally no, except for obscure code-golf hacks like push 16 / push 16 / pop 32 to concatenate two values with less code-size than shift/or, but also much lower performance. Search for the "When would you ever want to push 16 bits?" bolded sentence at the start of a section in the answer I linked for a link to an example of that. The existence of `push imm16` is not because they thought it was useful, more so because disabling it would have taken a special case for decoding, I think. – Peter Cordes Jan 14 '22 at 03:14
  • I guess another possible use-case in 64-bit mode is constructing data (like a string) on the stack; without a free register, you can do `push qword imm32` / `mov dword [rsp+4], imm32` for 8 bytes, and one or more `push imm16` for the last (or first) byte or two, if you can't just `push qword imm8`. But all instructions like `iret` that pop things from the stack, even into segment regs, have each thing padded to a qword, so again that would only be for odd hacks. – Peter Cordes Jan 14 '22 at 03:17
  • Your disassembler is printing addresses in decimal but prefixing them with "0x" anyway. – prl Jan 14 '22 at 07:45
  • @prl ty, good catch. – J.Todd Jan 14 '22 at 14:09
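
The rule from the comments (in 64-bit mode `push` only takes an immediate that is sign-extended to 64 bits, otherwise the value has to come from a register or memory) can be sketched as a small check. This is an illustration, not an assembler: `push_imm_form` and `push_sequence` are hypothetical helpers, the `%rax` scratch register is an arbitrary choice that gets clobbered, and the emitted AT&T text is assumed to be GNU-as style and is not assembled here.

```python
MASK64 = (1 << 64) - 1

def to_signed64(value):
    """Interpret the low 64 bits of value as a signed two's-complement number."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value

def push_imm_form(value):
    """Return the sign-extended immediate form that can put this 64-bit
    value on the stack ('imm8' or 'imm32'), or None if no form can."""
    signed = to_signed64(value)
    if -0x80 <= signed <= 0x7F:
        return "imm8"
    if -0x80000000 <= signed <= 0x7FFFFFFF:
        return "imm32"
    return None

def push_sequence(value, scratch="%rax"):
    """AT&T-syntax text that leaves the 64-bit value on the stack.
    Falls back to loading a scratch register when no immediate form fits."""
    if push_imm_form(value) is not None:
        return f"push ${to_signed64(value)}"
    return f"movabs $0x{value & MASK64:x}, {scratch}; push {scratch}"

for v in (0x21, 0xFFFFFFFF, 0xFFFFFFFFFFFFFFFF, 0x1D1D1D1D1D1D1D1D):
    print(hex(v), push_imm_form(v), "->", push_sequence(v))
```

Under this check, `0xffffffff` has no immediate form as a 64-bit value because a 32-bit immediate with the top bit set sign-extends to `0xffffffffffffffff`, which is why `push $-1` is the way to get all ones.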
