Assembly print variables and values

Question

I have this code

global start

section .text

start:
mov rax,0x2000004
mov rdi,1
mov rsi,msg
mov rdx,msg.len
syscall

mov rax,0x2000004
mov rdi,2
mov rsi,msgt
mov rdx,msgt.len
syscall

mov rax,0x2000004
mov rdi,3
mov rsi,msgtn
mov rdx,msgtn.len
syscall

mov rax,0x2000001
mov rdi,0
syscall

section .data

msg: db "This is a string",10
.len: equ $ - msg

var: db 1

msgt: db "output of 1+1: "
.len: equ $ - msgt

msgtn: db 1
.len: equ $ - msg

I want to print the variable msgtn. I tried msgt: db "output of 1+1", var, but the NASM assembler failed with:

second.s:35: error: Mach-O 64-bit format does not support 32-bit absolute addresses

Instead of the variable, I also tried "output of 1+1", [1+1], but I got:

second.s:35: error: expression syntax error

I also tried it without the brackets; that printed no number, only the string "1+1".

The command I used to assemble my program was:

/usr/local/Cellar/nasm/*/bin/nasm -f macho64 second.s && ld -macosx_version_min 10.7.0 second.o second.o

nasm -v shows:

NASM version 2.11.08 compiled on Nov 27 2015

OS X 10.9.5 with Intel core i5 (x86_64 assembly)

Can you show us the commands you use to assemble and link your program? It would also be good to know what version of NASM you are using (`nasm -v` should give the version). Editing your question to give the exact errors will help as well. — Michael Petch, Dec 11 '15 at 20:29
First thing, I recommend not using 2.11.08. It has some nasty issues (bugs). Get an older or a newer version. With a proper version of NASM, commands like `nasm -f macho64 -o second.o second.s` and `ld second.o -o second` should work. I did notice in your update that your linking looks unusual: your input object and output file are both `second.o`. — Michael Petch, Dec 11 '15 at 20:38
But I think the main problem is that you want to print out a number as a string. You can't do that with sys_write syscall directly. You need to convert a number to a string and pass the address of that string to sys_write. Alternatively you can link against the c library and use printf — Michael Petch, Dec 11 '15 at 20:48
@MichaelPetch I edited my question. I actually was using 2.11.08, as I brew-installed it today. How would I convert that number into a string without printf? — John K, Dec 11 '15 at 20:51
One method is to repeatedly divide the number by 10 (until the quotient is 0), storing the remainder of each division (converted to a character by adding the ASCII character '0') to a buffer in reverse order. If you google you should be able to find some examples of this method. — Michael Petch, Dec 11 '15 at 20:56

score 3 · Accepted Answer · edited May 23 '17 at 11:44

db directives let you put assemble-time-constant bytes into the object file (usually in the data section). You can use an expression as an argument, to have the assembler do some math for you at assemble time. Anything that needs to happen at run time needs to be done by instructions that you write, and that get run. It's not like C++ where a global variable can have a constructor that gets run at startup behind the scenes.

msgt: db "output of 1+1", var

would place those ASCII characters, followed by (the low byte of?) the absolute address of var. You'd use this kind of thing (with dd or dq) to do the equivalent of this C: int var; int *global_ptr = &var;, where a global/static pointer variable starts out initialized to point to another global/static variable. I'm not sure if MacOS X allows this with a 64-bit pointer, or if it just refuses to do relocations for 32-bit addresses. But that's why you're getting:

second.s:35: error: Mach-O 64-bit format does not support 32-bit absolute addresses

Note that the numeric value of the pointer depends on where in virtual address space the code is loaded, so the address isn't strictly an assemble-time constant. The linker needs to mark things that need run-time relocation, like those 64-bit immediate-constant addresses you mov into registers (mov rsi,msg). See this answer for some information on the difference between that and lea rsi, [rel msg], which gets the address into a register using a RIP-relative method. (That answer has links to more detailed info, and so does the x86 tag wiki.)
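As a rough illustration of the RIP-relative form, here is the first write from the question rewritten with lea and default rel. This is only a sketch: it keeps the question's labels and syscall numbers and assumes a NASM version without the 2.11.08 problems mentioned in the comments.

```nasm
default rel                     ; make [msg] mean [rel msg] everywhere

global start

section .text
start:
    mov     eax, 0x2000004      ; write (writing eax zero-extends into rax)
    mov     edi, 1              ; fd 1 = stdout
    lea     rsi, [rel msg]      ; address computed relative to RIP, no absolute relocation
    mov     edx, msg.len        ; length is a plain assemble-time constant
    syscall

    mov     eax, 0x2000001      ; exit
    xor     edi, edi            ; status 0
    syscall

section .data
msg:    db "This is a string", 10
.len:   equ $ - msg
```

Only the address load changes; msg.len is the difference of two labels in the same section, so it needs no relocation either way.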

Your attempt at using db [1+1]: What the heck were you expecting? [] in NASM syntax means memory reference. First: the resulting byte has to be an assemble-time constant. I'm not sure if there's an easy syntax for duplicating whatever's at some other address, but this isn't it. (I'd just define a macro and use it in both places.) Second: 2 is not a valid address.

msgt: db "output of 1+1: ",   '0' + 1 + 1,    10

would put the ASCII characters: output of 1+1: 2\n at that point in the object file. 10 is the decimal value of ASCII newline. '0' is a way of writing 0x30, the ASCII encoding of the character '0'. A byte with the value 2 is not a printable ASCII character. A version that emitted the raw value would still have printed that byte, but you wouldn't notice unless you piped the output into hexdump (or od -t x1c or something, IDK what OS X provides. od isn't very nice, but it is widely available.)

Note that this string is not null-terminated. If you want to pass it to something expecting an implicit-length string (like fputs(3) or strchr(3), instead of write(2) or memchr(3)), tack on an extra , 0 to add a zero-byte after everything else.

If you wanted to do the math at run-time, you need to get the data into a register, add it, then store a string representation of the number into a buffer somewhere. (Or print it one byte at a time, but that's horrible.)

The easy way is to just call printf, to easily print a constant string with some stuff substituted in. Spend your time writing asm for the part of your code that needs to be hand-tuned, not re-implementing library functions.
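A minimal sketch of the printf route might look like the following. It assumes the object file is linked against the C library (for example through the compiler driver rather than a bare ld, which is not the asker's command), and that symbols carry the leading underscore macOS uses for C names. The label fmt is made up for this example.

```nasm
default rel

global _main                    ; the C runtime calls _main, not start
extern _printf

section .text
_main:
    push    rbp                 ; also realigns the stack to 16 bytes for the call
    mov     rbp, rsp

    mov     esi, 1
    add     esi, 1              ; second argument: 1+1, computed at run time
    lea     rdi, [rel fmt]      ; first argument: the format string
    xor     eax, eax            ; variadic call: al = number of vector registers used (0)
    call    _printf

    xor     eax, eax            ; return 0 from main
    pop     rbp
    ret

section .data
fmt:    db "output of 1+1: %d", 10, 0   ; printf needs the terminating 0
```

Returning from _main lets the C runtime flush stdout; exiting with the raw exit syscall instead could lose buffered output.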

There's some discussion of int-to-string in comments.
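The divide-by-10 method from the comments could be sketched like this, without any library. It is an illustration under assumptions, not a tuned routine: the value is treated as an unsigned 32-bit number, and the names prefix, buf and BUF_SIZE are invented for the example.

```nasm
default rel

global start

BUF_SIZE equ 16                 ; room for 10 digits of a 32-bit value plus a newline

section .text
start:
    ; write(1, prefix, prefix.len)
    mov     eax, 0x2000004
    mov     edi, 1
    lea     rsi, [rel prefix]
    mov     edx, prefix.len
    syscall

    mov     eax, 1
    add     eax, 1              ; the value to print, computed at run time

    lea     rsi, [rel buf + BUF_SIZE]   ; one past the end of the buffer
    dec     rsi
    mov     byte [rsi], 10      ; the newline goes last
    mov     ecx, 10             ; divisor
.next_digit:
    xor     edx, edx            ; div uses edx:eax as the dividend
    div     ecx                 ; eax = quotient, edx = remainder (0..9)
    add     dl, '0'             ; turn the remainder into an ASCII digit
    dec     rsi
    mov     [rsi], dl           ; store digits right to left
    test    eax, eax
    jnz     .next_digit         ; repeat until the quotient is 0

    ; write(1, rsi, end_of_buffer - rsi)
    lea     rdx, [rel buf + BUF_SIZE]
    sub     rdx, rsi
    mov     eax, 0x2000004
    mov     edi, 1
    syscall

    mov     eax, 0x2000001      ; exit(0)
    xor     edi, edi
    syscall

section .data
prefix: db "output of 1+1: "
.len:   equ $ - prefix

section .bss
buf:    resb BUF_SIZE
```

Filling the buffer from the end means the digits come out in the right order with no reversal step, and rsi already points at the first character when the loop finishes.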

Your link command looks funny:

ld -macosx_version_min 10.7.0 second.o second.o

Are you sure you want the same .o twice?

You could save some code bytes by only moving to 32bit registers when you don't need sign-extension into the 64bit reg. e.g. mov edi,2 instead of mov rdi,2 saves a byte (the REX prefix), unless NASM is clever and does that anyway (actually, it does).

lea rsi, [rel msg] (or use default rel) is a shorter instruction than mov r64, imm64, though. (The AT&T mnemonic is movabs, but Intel syntax still calls it mov.)
