
My setup

  • OS: Linux
  • Architecture: x86-64
  • Syntax: AT&T
  • Assembler: GAS

My code

(explanation below)

.section .text
.globl _start
print:
  movq $1, %rax
  movq $1, %rdi
  movq %rsp, %rsi
  popq %rbx
  popq %rdx

  # Useless Instructions...
  xor %rcx, %rcx    # 36
  xor %rcx, %rcx    # 39
  xor %rcx, %rcx    # 42
  xor %rcx, %rcx    # 45
  xor %rcx, %rcx    # 48
  xor %rcx, %rcx    # 51
  xor %rcx, %rcx    # 54
  xor %rcx, %rcx    # 57
  xor %rcx, %rcx    # 60
  xor %rcx, %rcx    # 63
  xor %rcx, %rcx    # 66

  syscall
  jmp end
  ret

_start:
  pushq $66
  pushq $1
  call print
end:
  movq $60, %rax
  xor %rdi, %rdi
  syscall

Output: "B"


Explanation

Here is what I want to do:

  1. Push $66 ('B') onto the stack
  2. Call the print function
  3. Move the $66 into the %rsi register
  4. Write to STDOUT with Linux syscall 1, the write syscall
  5. Jump to the end label; otherwise I get "Segmentation fault (core dumped)"


What's wrong

I expect it to output "B", which it does. So where is the problem, you may ask? You have probably already noticed that bunch of xor %rcx, %rcx instructions.

Let me explain what they do:

  • For some reason the printed value starts at 33 ("!")
  • Every instruction I add to this code adds 3 to this number
  • So I added 11 instructions, and the 33 became 66 (33 + 11*3 = 66), which is "B" in ASCII

Without the 11 xor instructions, it prints "!", which has the ASCII value 33.

Note: xor %rcx, %rcx is just an arbitrary instruction. It could be any instruction, and every instruction seems to count as 3.


That means

I can add as many instructions as I want to print ASCII values above 33 (in steps of 3).

Example:

If I want to echo an "H", all I have to do is to add 2 more xor %rcx, %rcx instructions.

Calculation: 66 ("B") + 3 (1 instruction) + 3 (1 instruction) = 72 ("H")



Final Question

Why? Just why? I just started with x86 ASM, so I am completely new to it. There are clearly some obvious things I am missing here... I just played around a bit and stumbled across this.

Can you help me answer these questions?

  1. How does it work?
  2. How can it affect the output / RSI register?
  3. Why does it start at 33?
  4. Any ideas on how to fix this?

My guess is that it has something to do with the instruction pointer, but I have no clue...


Thank you for your help :)

Pixelbog
  • `call print` places a return address on the stack. You then print that. Of course that return address points to after the `call` which is affected by how much code you have. The `3` is because the `xor %rcx, %rcx` is 3 bytes. Other instructions are not 3 bytes. 64 bit calling convention does not pass arguments on the stack but you are of course free to invent your own. To fix the code you can do `lea 16(%rsp), %rsi` – Jester Jan 09 '23 at 13:17
  • @Jester Omg thanks, ofc that makes so much sense now^^ I bet for someone like you who has so much experience on ASM, questions like this must annoy you, sry for that ^^ And 2 more questions, just to be clear: `1.` It is `16(%rsp)`, because the one `pushq` and the `call` both push 8 Bytes onto the stack, and you want to access the Bytes after the `16 Bytes`. `2.` You don't use the stack, cuz you use the `rdi, rsi, rdx...` registers for arguments. Did I understand everything right? :)) – Pixelbog Jan 09 '23 at 14:09
  • Yes that is correct. It is not annoying, we are here to help :) – Jester Jan 09 '23 at 14:12
  • @Jester Wow **Assembly Language** is so much fun :D I still have so much to learn^^ But I really fell in love with it^^ And really thanks for your help and your kindness :) – Pixelbog Jan 09 '23 at 14:22
  • There are some good resources linked from https://stackoverflow.com/tags/x86/info, including a section at the bottom on using GDB to single-step and watch register values change. Also worth watching Matt Godbolt's CppCon2017 talk “[What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid](https://youtu.be/bSkpMdDe4g4)”. (See also [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116)) – Peter Cordes Jan 10 '23 at 06:01
  • @Jester don't you want to write a short answer so I can close this question?^^ You would get `+25 Rep` :) – Pixelbog Aug 23 '23 at 07:29
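
For readers following the comments, here is a minimal sketch of the fix Jester suggests, assuming the question's own stack-based argument convention is kept (length pushed last, character pushed first). The `leaq 16(%rsp), %rsi` line is the one from the comment; reading the length with `movq 8(%rsp), %rdx` instead of popping, returning with `ret`, and cleaning up with `addq $16, %rsp` are additions made here for illustration and are not part of the original thread.

```asm
.section .text
.globl _start

# Stack layout on entry to print (each slot is 8 bytes):
#    0(%rsp)  return address pushed by `call print`
#    8(%rsp)  length (the $1 pushed second)
#   16(%rsp)  character (the $66 pushed first, low byte stored first)
print:
  movq $1, %rax          # syscall number 1 = write
  movq $1, %rdi          # file descriptor 1 = stdout
  leaq 16(%rsp), %rsi    # buffer = address of the pushed character (Jester's fix)
  movq 8(%rsp), %rdx     # length, read in place instead of popped (assumption of this sketch)
  syscall
  ret                    # the return address is still on top of the stack

_start:
  pushq $66              # 'B'
  pushq $1               # number of bytes to write
  call print
  addq $16, %rsp         # caller removes its two arguments (assumption of this sketch)
  movq $60, %rax         # syscall number 60 = exit
  xorq %rdi, %rdi        # exit status 0
  syscall
```

Because nothing is popped inside `print`, the return address stays where `ret` expects it, so the `jmp end` workaround is no longer needed. Assuming the file is saved as `print.s` (a hypothetical name), it can be built with `as print.s -o print.o` followed by `ld print.o -o print`, and it should print "B" no matter how many instructions are added to `print`.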
