
I'm trying to learn a little bit of assembly, so I've been following the most basic tutorial I can find that takes into account differences between macOS and linux-based OSs. I'm running macOS Catalina 10.15.1.

Unfortunately, I can't get the examples to link. Following along with this answer, I was able to link the basic hello.asm by adding the -lSystem flag to ld. That is, I saved the following code as hello.asm

; ----------------------------------------------------------------------------------------
; Writes "Hello, World" to the console using only system calls. Runs on 64-bit macOS only.
; To assemble and run:
;
;     nasm -fmacho64 hello.asm && ld hello.o && ./a.out
; ----------------------------------------------------------------------------------------

          global    start

          section   .text
start:    mov       rax, 0x02000004         ; system call for write
          mov       rdi, 1                  ; file handle 1 is stdout
          mov       rsi, message            ; address of string to output
          mov       rdx, 13                 ; number of bytes
          syscall                           ; invoke operating system to do the write
          mov       rax, 0x02000001         ; system call for exit
          xor       rdi, rdi                ; exit code 0
          syscall                           ; invoke operating system to exit

          section   .data
message:  db        "Hello, World", 10      ; note the newline at the end

and instead of running the given linking / execution commands, ran

nasm -f macho64 hello.asm && ld -lSystem hello.o && ./a.out

and it successfully linked and wrote Hello, World to stdout.

However, when I move on to the second example,

; ----------------------------------------------------------------------------------------
; This is an OSX console program that writes a little triangle of asterisks to standard
; output. Runs on macOS only.
;
;     nasm -fmacho64 triangle.asm && gcc hola.o && ./a.out
; ----------------------------------------------------------------------------------------

          global    start
          section   .text
start:
          mov       rdx, output             ; rdx holds address of next byte to write
          mov       r8, 1                   ; initial line length
          mov       r9, 0                   ; number of stars written on line so far
line:
          mov       byte [rdx], '*'         ; write single star
          inc       rdx                     ; advance pointer to next cell to write
          inc       r9                      ; "count" number so far on line
          cmp       r9, r8                  ; did we reach the number of stars for this line?
          jne       line                    ; not yet, keep writing on this line
lineDone:
          mov       byte [rdx], 10          ; write a new line char
          inc       rdx                     ; and move pointer to where next char goes
          inc       r8                      ; next line will be one char longer
          mov       r9, 0                   ; reset count of stars written on this line
          cmp       r8, maxlines            ; wait, did we already finish the last line?
          jng       line                    ; if not, begin writing this line
done:
          mov       rax, 0x02000004         ; system call for write
          mov       rdi, 1                  ; file handle 1 is stdout
          mov       rsi, output             ; address of string to output
          mov       rdx, dataSize           ; number of bytes
          syscall                           ; invoke operating system to do the write
          mov       rax, 0x02000001         ; system call for exit
          xor       rdi, rdi                ; exit code 0
          syscall                           ; invoke operating system to exit

          section   .bss
maxlines  equ       8
dataSize  equ       44
output:   resb      dataSize

I get an error from ld. (Note the typo in the bash command given. I actually tried nasm -f macho64 triangle.asm && ld -lSystem triangle.o && ./a.out.) The message is:

Undefined symbols for architecture x86_64:
  "_main", referenced from:
     implicit entry/start for main executable
ld: symbol(s) not found for architecture x86_64

This is the same error as when I link the first example without the -lSystem flag; however, here adding the -lSystem flag does not solve the problem.

At the most concrete level, what changes do I have to make to triangle.asm, or to the command I'm using to link it, to solve this "undefined symbols" error? At a more general level, why am I getting this error at all? The symbol _main doesn't appear anywhere in my code. Does my object file actually get "wrapped" in something else by the linker to create a runnable executable? If so, is this a necessary default? Why can't the processor effectively just run these instructions from top to bottom? I would have assumed that this code is sufficiently low-level that it can be run without linking to any additional libraries.

jgaeb
  • Note that generally, the `-l` operand needs to go after the object files that use the library. It is possible that macOS uses lld as a linker which doesn't have this restriction, but nevertheless it's a good idea to do it correctly. – fuz Dec 14 '19 at 17:52
  • The error message seems to hint that the default entry point is `_main`. Either use that, or specify `-e start` to override it. – Jester Dec 14 '19 at 18:12
  • **This tutorial seems like partial nonsense**. It's not plausible that `gcc hola.o` could link the second example; it doesn't define `_main`, but macOS decorates symbols with a leading `_` and gcc will link the CRT start files, which look for a `_main` as if from compiling a `.c` file with `int main(){}`. Also, `nasm -fmacho64 triangle.asm` doesn't create `hola.o`. The code might be correct, but the build commands don't even look close to right. – Peter Cordes Dec 14 '19 at 18:20
  • Ok. It seems like a better overall strategy is perhaps to go through a Linux tutorial on a remote machine until I understand x86 passably well, because my impression from reading some more is that lots of syscalls are undocumented, evidently available learning materials aren't correct, and generally things seem like a bit of a mess on macOS. – jgaeb Dec 14 '19 at 19:36
  • @jgaeb Generally speaking, you don't want to do direct system calls on most operating systems as they are not guaranteed to be stable. Linux is one of the few exceptions. Instead, call into the libc to perform system calls. – fuz Dec 14 '19 at 23:44

2 Answers


Tested on macOS 10.15:

% nasm -f macho64 -o tri.o triangle.asm
% ld -o tri -e start tri.o -macosx_version_min 10.15 -static
% ./tri

Output: (screenshot of the terminal output, not reproduced here)
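
Since the screenshot is not available as text, here is a minimal Python sketch of what the linked program should print, assuming the triangle.asm listing from the question is assembled unchanged. The function name build_triangle is made up for illustration; the constants mirror maxlines and dataSize from the .bss section, and the loop follows the register logic of the assembly (r8 as the line length, r9 as the star count).

```python
MAXLINES = 8    # mirrors "maxlines equ 8" in triangle.asm
DATA_SIZE = 44  # mirrors "dataSize equ 44" in triangle.asm


def build_triangle(maxlines: int) -> bytes:
    """Hypothetical helper: fill a buffer the way the assembly loop does."""
    out = bytearray()
    line_len = 1                      # r8: initial line length
    while True:
        stars = 0                     # r9: stars written on this line so far
        while True:
            out.append(ord("*"))      # mov byte [rdx], '*'
            stars += 1                # inc r9
            if stars == line_len:     # cmp r9, r8 / jne line
                break
        out.append(10)                # newline character
        line_len += 1                 # inc r8
        if line_len > maxlines:       # cmp r8, maxlines / jng line
            break
    return bytes(out)


if __name__ == "__main__":
    buf = build_triangle(MAXLINES)
    # 1 + 2 + ... + 8 stars plus 8 newlines should fill the buffer exactly.
    assert len(buf) == DATA_SIZE
    print(buf.decode("ascii"), end="")
```

Run as a script, this prints eight lines growing from one asterisk to eight. The assertion checks that the loop fills exactly 44 bytes (36 stars plus 8 newlines), which is the size passed to the write system call.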

jbnunn
Niall Lv

Just ld -o tri tri.o -static seems to do the job as well. It looks like the missing ingredient was the -static flag.

Accy