
I'm a complete beginner at assembly language. I'm trying to compile some C code and disassemble the result with otool.

#include <stdio.h>

int main(void) {
    printf("a");
    return 0;
}

Then, after running gcc -o main main.c; otool -tvV main, I get:

main:
(__TEXT,__text) section
_main:
0000000100000f60        pushq   %rbp
0000000100000f61        movq    %rsp, %rbp
0000000100000f64        subq    $0x10, %rsp
0000000100000f68        movl    $0x0, -0x4(%rbp)
0000000100000f6f        leaq    0x34(%rip), %rdi ## literal pool for: "a"
0000000100000f76        movb    $0x0, %al
0000000100000f78        callq   0x100000f8a ## symbol stub for: _printf
0000000100000f7d        xorl    %ecx, %ecx
0000000100000f7f        movl    %eax, -0x8(%rbp)
0000000100000f82        movl    %ecx, %eax
0000000100000f84        addq    $0x10, %rsp
0000000100000f88        popq    %rbp
0000000100000f89        retq

If I change printf("a") to print a different string, I get:

main:
(__TEXT,__text) section
_main:
0000000100000f50        pushq   %rbp
0000000100000f51        movq    %rsp, %rbp
0000000100000f54        subq    $0x10, %rsp
0000000100000f58        movl    $0x0, -0x4(%rbp)
0000000100000f5f        leaq    0x34(%rip), %rdi ## literal pool for: "iiidachsaljhdlasjdlsajdsajkha"
0000000100000f66        movb    $0x0, %al
0000000100000f68        callq   0x100000f7a ## symbol stub for: _printf
0000000100000f6d        xorl    %ecx, %ecx
0000000100000f6f        movl    %eax, -0x8(%rbp)
0000000100000f72        movl    %ecx, %eax
0000000100000f74        addq    $0x10, %rsp
0000000100000f78        popq    %rbp
0000000100000f79        retq

Now the comment (##) says the code is printing a different string literal, but as far as I can read the assembly, I cannot tell where the specific literal is defined: apart from the addresses, I found no difference between the former and the latter disassembly.

Could anyone explain how this works? Does assembly language hide the definition of string literals?

fuz
suganology
    Instead of compiling into machine code and then disassembling, try asking the compiler to generate assembly code directly using the option `-S`. The result is actual assembly code and not some sort of reconstruction as the disassembly is. – fuz May 27 '20 at 02:36
    That isn't assembly language proper, it's *disassembly* in `otool`'s output format. See [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) for how to look at compiler output. Also, turns out there was already a Q&A about looking at disassembly of code that referenced a string literal so this is a duplicate. – Peter Cordes May 27 '20 at 02:41
    Based on my experience on the Windows platform, when you compile your C code, hard-coded strings such as the `"a"` and `"iiidachsaljhdlasjdlsajdsajkha"` passed to `printf()` are treated as "global string constants". Although the string content changed, the address of this global string constant did not change. Therefore, there is no difference in your assembly code, which resides in the .text segment of the executable. However, if you check the executable's .rdata segment, you will find that your `"a"` changed to `"iiidachsaljhdlasjdlsajdsajkha"`. – zhugen May 27 '20 at 02:50
