I have this assembly program (test.asm):

global main
extern _printf

section .text
main:
  push  rbp
  mov   rdi, message2
  call  _printf
  pop   rbp
  ret

section .data
message1: db "foo", 10, 0
message2: db "bar", 10, 0

My problem is that it doesn't print anything. If I use message1 it works, just not with message2. What am I doing wrong?

When I look at the final executable, it seems that the address in the MOV instruction points to the address right after the message2 string. There are just null bytes there. In test.o the address points to two bytes before the proper start of the string.

I assemble and link it like this:

nasm -f macho64 -o test.o test.asm
ld -arch x86_64 -macos_version_min 10.14 -no_pie -lSystem -e main -o test test.o

I'm using NASM 2.13.03. I'm on macOS 10.14.6 on a MacBook Pro with an Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz.


Ghidra output for test.o:
                     //
                     // __text 
                     // __TEXT
                     // ram: 00000000-00000011
                     //
                     undefined main()
     undefined         AL:1           <RETURN>
                     main
00000000 55              PUSH       RBP
00000001 48 bf 17        MOV        RDI,0x17
         00 00 00 
         00 00 00 00
0000000b e8 f0 0f        CALL       _printf
         00 00
00000010 5d              POP        RBP
00000011 c3              RET
                     //
                     // __data 
                     // __DATA
                     // ram: 00000012-0000001b
                     //
                     message1
00000012 00              ??         00h
00000013 00              ??         00h
00000014 66              ??         66h    f
00000015 6f              ??         6Fh    o
00000016 6f              ??         6Fh    o
                     message2
00000017 0a              ??         0Ah
00000018 00              ??         00h
00000019 62              ??         62h    b
0000001a 61              ??         61h    a
0000001b 72              ??         72h    r
                     //
                     //  
                     // ram: 0000001c-0000001d
                     //
0000001c 0a              ??         0Ah
0000001d 00              ??         00h
[snip]

And for the final executable:

[snip]
                      //
                      // __text 
                      // __TEXT
                      // ram: 100000f84-100000f95
                      //
                      undefined entry()
      undefined         AL:1           <RETURN>
                      main
                      entry
100000f84 55              PUSH       RBP
100000f85 48 bf 22        MOV        RDI=>DAT_100001022,DAT_100001022
          10 00 00 
          01 00 00 00
100000f8f e8 02 00        CALL       __stubs::_printf
          00 00
100000f94 5d              POP        RBP
100000f95 c3              RET
[snip]
                      //
                      // __data 
                      // __DATA
                      // ram: 100001018-100001021
                      //
                      message1
100001018 66              ??         66h    f
100001019 6f              ??         6Fh    o
10000101a 6f              ??         6Fh    o
10000101b 0a              ??         0Ah
10000101c 00              ??         00h
                      message2
10000101d 62              ??         62h    b
10000101e 61              ??         61h    a
10000101f 72              ??         72h    r
100001020 0a              ??         0Ah
100001021 00              ??         00h
                      //
                      // __DATA 
                      // __DATA
                      // ram: 100001022-100001fff
                      //
                      DAT_100001022
100001022 00              ??         00h
100001023 00              ??         00h
100001024 00              ??         00h
[snip]
mistercake
  • Post proof of your claim, such as disassembly of the relevant part of the executable along with the strings. PS: you should zero `al` before calling `printf`. – Jester Oct 09 '19 at 18:06
  • I added what Ghidra shows me. I hope that will do. Why do I need to zero `al`? – mistercake Oct 09 '19 at 18:20
  • Because the calling convention says it should contain the number of SSE registers used for passing arguments to varargs functions (and `printf` is one). But that's unrelated to your problem, which is interesting. Does it work if you use `lea rdi, [rel message2]`? – Jester Oct 09 '19 at 18:24
  • @Jester: this smells like one of the known bugs with NASM's mach-o output format. IIRC, the more recent one affects 64-bit absolute instead of RIP-relative. – Peter Cordes Oct 09 '19 at 18:27
  • Ah yeah I keep forgetting about that bug. – Jester Oct 09 '19 at 18:28
  • `lea rdi, [rel message2]` works! Thank you. – mistercake Oct 09 '19 at 18:29
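
Putting the two suggestions from the comments together, the program might look like the sketch below. It assumes the same `nasm` and `ld` commands as in the question. The only changes are the RIP-relative `lea` in place of the 64-bit absolute `mov`, and zeroing `al` before the varargs call. This is a workaround for the suspected NASM mach-o bug mentioned above, not a fix for the bug itself.

```nasm
; Workaround sketch: same program as in the question, but the string
; address is formed RIP-relatively rather than as a 64-bit absolute.
global main
extern _printf

section .text
main:
  push  rbp                   ; as in the original; also keeps rsp 16-byte aligned at the call
  lea   rdi, [rel message2]   ; first argument: RIP-relative address of the format string
  xor   eax, eax              ; al = 0: no vector registers carry varargs arguments
  call  _printf
  pop   rbp
  ret

section .data
message1: db "foo", 10, 0
message2: db "bar", 10, 0
```

With the RIP-relative form the instruction no longer needs a 64-bit absolute relocation, which is the case the comments suspect NASM of mishandling in mach-o output.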
