I'm working on x86-64 assembly using NASM. My task is to write a program that accepts a number as a command-line argument and converts it to an integer. The program works normally when linked with ld, but when it is linked with GCC, a segmentation fault occurs.

Here is the code I wrote:

Projekat.asm:

    %include "Macros.asm"

    section .data

    section .bss
    
    section .text
    global main
    main:
        mov rax, [rsp + 16]      ; Intended to load the second argument (the number string) into rax
        call _convertToInt
        newline                  ; newline, exit and printValue are macros located in Macros.asm
        exit 

    _convertToInt: 
        mov rdi, rax

        mov rax, 0
        _convertLoop: 
            movzx rcx, byte [rdi]   ; According to GDB, the segfault occurs on this line
            cmp rcx, 0
            je _end
            sub rcx, 48
            imul rax, 10
            add rax, rcx
            inc rdi 
            jmp _convertLoop

        _end: 
            printValue rax      
            ret

Macros.asm:

    section .data
        newline db 10, 0
        newline_length equ $-newline 

    section .bss
        digitSpace resb 100
        digitSpacePos resb 8




    %macro newline 0
        mov rax, 1
        mov rdi, 1
        mov rsi, newline 
        mov rdx, 1 
        syscall
    %endmacro

    %macro exit 0
        mov rax, 60
        mov rdi, 0
        syscall
    %endmacro

    %macro printValue 1
        mov rax, %1

        mov rcx, digitSpace
        mov [digitSpacePos], rcx 

    %%printValLoop:
        mov rdx, 0 
        mov rbx, 10
        div rbx   
        mov r14, rax  
        add rdx, 48  

        mov rcx, [digitSpacePos]
        mov [rcx], dl  
        inc rcx 
        mov [digitSpacePos], rcx

        mov rax, r14 
        cmp rax, 0
        jne %%printValLoop

    %%printValFinalLoop:
        mov rcx, [digitSpacePos]

        mov rax, 1
        mov rdi, 1
        mov rsi, rcx
        mov rdx, 1
        syscall

        mov rcx, [digitSpacePos]
        dec rcx
        mov [digitSpacePos], rcx

        cmp rcx, digitSpace
        jge %%printValFinalLoop
    %endmacro

I'm using 64-bit Ubuntu 20.04. Here are the commands that I used:

    nasm -f elf64 -g -F dwarf Projekat.asm
    ld -o Projekat Projekat.o
    ./Projekat 485

In this case, the code runs normally and prints the number.

However, if I use GCC as a linker, this happens:

    nasm -f elf64 -g -F dwarf Projekat.asm
    gcc Projekat.o -static -o Projekat
    ./Projekat 485

This causes a segfault.

I also ran GDB's backtrace command, and this is what the stack looks like:

    #0  _convertLoop () at Projekat.asm:59
    #1  0x0000000000401c0a in main () at Projekat.asm:13

I've been trying to solve this for 7 days now. If anyone knows what causes the segfault, and why the code works normally with ld but not with GCC, please let me know. Thanks to all in advance!

EDIT: Changing main to _start makes the code work properly when linked with ld. However, after changing _start back to main and linking with gcc, the code no longer works.

Overdrive
  • When you use gcc it links in the C library which has the process entry point and calls your `main` according to the calling convention. `[rsp + 16]` is then not the second command line argument. – Jester Feb 20 '23 at 22:57
  • When using `ld` directly, you left out the warning `ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000`. (Unless you changed the source between builds, but you didn't mention doing that in your question. `main` is a function, called according to the normal ABI for passing function args, unlike `_start` which is the process entry point.) – Peter Cordes Feb 21 '23 at 01:07
  • Forgot to mention, I've tried that too. It seems like the mistake is not there either. – Overdrive Feb 21 '23 at 07:03
  • Just changing the symbol name doesn't fix anything because the same machine code ends up in the executable; what I meant was that even if your source code says `main:`, linking with just `ld` without the CRT startup code will behave like there's an implicit `_start:` / `global _start` at the top of the `.text` section. That's why code that assumes conditions from the ELF entry point can work even if the source calls it `main:`. And I also meant that the warning should alert you to the fact that `main` and `_start` are different things. Jester's comment is the correct answer. – Peter Cordes Feb 21 '23 at 07:19
  • Thank you Peter & Jester, I finally figured it out. Command line arguments truly work differently when I use gcc. Like Jester said, using gcc means that argc will be stored in rdi, and argv will be stored in rsi. The code is finally working! – Overdrive Feb 21 '23 at 09:17
  • Yeah, GCC links code for its own `_start` that calls your `main`. – Peter Cordes Feb 21 '23 at 21:31
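
The resolution described in the comments can be sketched as follows. This is a minimal illustration, not the asker's final code: it assumes Linux x86-64 with the System V calling convention, so that `main` receives argc in `edi` and argv in `rsi`, and it prints through libc's `printf` instead of the `printValue` macro so that the example is self-contained. The labels `fmt`, `.convert`, `.print` and `.usage` are names chosen for this sketch. In the question's own source, the equivalent change would be loading the argument with `mov rax, [rsi + 8]` instead of `mov rax, [rsp + 16]`.

```nasm
; Sketch only: assumes Linux x86-64 and the System V AMD64 calling convention.
; Build (same commands as in the question):
;   nasm -f elf64 -g -F dwarf Projekat.asm
;   gcc Projekat.o -static -o Projekat

        default rel
        extern  printf
        global  main

        section .rodata
fmt:    db      "%ld", 10, 0        ; format string for printf (name is hypothetical)

        section .text

; int main(int argc, char **argv)
;   edi = argc, rsi = argv (the CRT's _start has already set these up)
main:
        sub     rsp, 8              ; realign the stack to 16 bytes before calling printf
        cmp     edi, 2              ; is there an argument after the program name?
        jl      .usage
        mov     rdi, [rsi + 8]      ; rdi = argv[1], pointer to the digit string
        xor     eax, eax            ; rax = 0, the running result

.convert:
        movzx   ecx, byte [rdi]     ; load the next character, zero-extended into rcx
        test    ecx, ecx
        jz      .print              ; stop at the terminating NUL byte
        sub     ecx, '0'            ; ASCII digit -> value (no validation: digits assumed)
        imul    rax, rax, 10
        add     rax, rcx
        inc     rdi
        jmp     .convert

.print:
        lea     rdi, [fmt]          ; 1st argument: format string
        mov     rsi, rax            ; 2nd argument: the converted integer
        xor     eax, eax            ; variadic call: no vector registers used
        call    printf
        xor     eax, eax            ; return 0 from main
        add     rsp, 8
        ret

.usage:
        mov     eax, 1              ; return 1 when no argument was given
        add     rsp, 8
        ret
```

Built with the gcc commands from the question, `./Projekat 485` should print 485. Returning from `main` with `ret` rather than issuing the raw `exit` syscall lets libc flush the stdout buffer that `printf` writes into.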

0 Answers