
I have started learning NASM assembly and reverse engineering. The first problem I ran into is, in short:

I can't restore the decompiled program using the objconv utility.

My simple app:

#include <stdio.h>

char* msg = "Hello World!";

int main(void) {
    printf("%s\r\n", msg);
    return 0;
}

1). The first step was:

gcc -fno-asynchronous-unwind-tables -s -c -o 1.o 1.c

The -fno-asynchronous-unwind-tables flag was used to avoid generating unnecessary sections in the output object file.

2). Then I used the objconv utility like this:

objconv -fnasm 1.o

to generate assembly code for the NASM assembler. I got the following:

; Disassembly of file: 1.o
; Sun Aug 27 23:56:53 2017
; Mode: 64 bits
; Syntax: YASM/NASM
; Instruction set: 8086, x64

default rel

global main: function
global msg

extern printf                                           ; near


SECTION .text   align=1 execute                         ; section number 1, code

main:   ; Function begin
        push    rbp                                     ; 0000 _ 55
        mov     rbp, rsp                                ; 0001 _ 48: 89. E5
        mov     rax, qword [rel msg]                    ; 0004 _ 48: 8B. 05, 00000000(rel)
        mov     rsi, rax                                ; 000B _ 48: 89. C6
        mov     edi, ?_001                              ; 000E _ BF, 00000000(d)
        mov     eax, 0                                  ; 0013 _ B8, 00000000
        call    printf                                  ; 0018 _ E8, 00000000(rel)
        mov     eax, 0                                  ; 001D _ B8, 00000000
        pop     rbp                                     ; 0022 _ 5D
        ret                                             ; 0023 _ C3
; main End of function


SECTION .data   align=8 noexecute                       ; section number 2, data

msg:                                                    ; qword
        dq Unnamed_4_0                                  ; 0000 _ 0000000000000000 (d)


SECTION .bss    align=1 noexecute                       ; section number 3, bss


SECTION .rodata align=1 noexecute                       ; section number 4, const

        db 48H, 65H, 6CH, 6CH, 6FH, 20H, 57H, 6FH       ; 0000 _ Hello Wo
        db 72H, 6CH, 64H, 21H, 00H                      ; 0008 _ rld!.

?_001:                                                  ; byte
        db 25H, 73H, 0DH, 0AH, 00H                      ; 000D _ %s...

3). The next step was:

Removing the unnecessary parts, such as:

  • the align=N and execute/noexecute words from the SECTION lines
  • : function from the global declaration
  • the default rel line

Fixing the msg: dq Unnamed_4_0 line. I thought this part had been broken by objconv.

So I changed:

dq Unnamed_4_0

to:

db "Hello World",10

I did this even though the .rodata section already contains the string (I suspect my problem is exactly this incorrect handling of the string for output...).

4). Then I ran the following commands in the shell:

nasm -f elf64 1.asm
gcc 1.o

When I run the a.out file produced by gcc, I get the following error:

user@:~/Desktop/tmp$ ./a.out
Segmentation fault (core dumped)

So I failed to restore the original behavior from the disassembled program. The original program was compiled with:

gcc -std=c99 -o 1 1.c

Its source code is shown at the beginning of my question. My aim is simple: to build an executable via the objconv -> nasm approach that behaves like the original executable.

  • You're disassembling, not decompiling. This is not safe in general, because NASM syntax can't represent different instruction-encoding choices that might e.g. leave necessary padding in the PLT. If you're just keeping the disassembled `main`, then that should be ok. Probably you broke something, and you should use `gdb ./a.out` to find out what. (See the gdb for asm tips at the bottom of [the x86 tag wiki](http://stackoverflow.com/tags/x86/info).) – Peter Cordes Aug 27 '17 at 22:28
  • @PeterCordes thanks for the good & constructive answer –  Sep 07 '17 at 13:11

1 Answer


You wrote char* msg = "Hello World!";, so you have a pointer to a string literal stored in the read/write .data section. That's the dq Unnamed_4_0.

If you had written char msg[] = "Hello World!";, the bytes at msg would be the literal string, and main would pass a pointer to printf with mov esi, msg (i.e. mov r32,imm32), since msg is the second argument.

You changed the asm so the bytes at msg: are string data rather than a pointer. But your main still loads the first 8 bytes stored at msg and passes them to printf as a pointer (in rsi, as the argument for the %s conversion; the format string ?_001 goes in edi).

When printf tries to dereference those ASCII bytes as a pointer to the string to print, it segfaults. (Remember that in C, strings are passed by reference.)


BTW, I'd suggest compiling with gcc -Og or -O1 at least. It will be a lot easier to read the asm when it isn't storing/reloading everything to the stack after every C statement (to let you change any variable with a debugger). See also How to remove "noise" from GCC/clang assembly output? for more tips.

Peter Cordes