You're probably exiting without flushing stdio buffers. By default, stdout
is line-buffered when connected to a terminal (or full-buffered otherwise).
Your format string doesn't end with a newline: it ends with a literal backslash and n, because NASM doesn't process C escape sequences inside double quotes. Use backticks for that:
fmt: db `%f %f\n`, 0
Or use the numeric ASCII code for newline:
fmt: db "%f %f", 10, 0
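To see the difference in bytes, here's a minimal C sketch. The names ends_with_newline, fmt_double_quoted and fmt_backquoted are made up for illustration; the first array just reproduces in C the bytes NASM would emit for the double-quoted version.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the string's last byte is a newline (ASCII 10), else 0. */
int ends_with_newline(const char *s)
{
    size_t len = strlen(s);
    return len > 0 && s[len - 1] == '\n';
}

/* Same bytes NASM emits for a double-quoted "%f %f\n": the escape is not
   processed, so the string ends with a backslash byte followed by 'n'.
   In C that needs an escaped backslash. */
const char fmt_double_quoted[] = "%f %f\\n";

/* Same bytes NASM emits for the backquoted string, or for "%f %f", 10 */
const char fmt_backquoted[] = "%f %f\n";
```

The double-quoted version is one byte longer and contains no newline at all, so a line-buffered stdout never sees a reason to flush.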
When you're using C stdio function calls, you should exit by returning from main, or by calling libc's exit function, not by making a sys_exit system call directly. The library function flushes stdio buffers first and runs destructors and whatever else; the system call just exits.
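You can reproduce the effect in plain C on a POSIX system. This is only a sketch: bytes_written_before_exit is a hypothetical helper, and _exit() stands in for the raw system call because it also skips the stdio flush. The child prints without a trailing newline into a pipe, and the parent counts what actually arrives.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child printf()s text with no trailing newline,
   then leaves via exit() (use_libc_exit != 0) or via _exit() (== 0).
   Returns the number of bytes that reached the pipe, or -1 on error. */
long bytes_written_before_exit(int use_libc_exit)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    fflush(stdout);                 /* don't let the child inherit pending output */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: stdout now points at the pipe */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO) {
            dup2(fds[1], STDOUT_FILENO);
            close(fds[1]);
        }
        printf("%f %f", 1.5, 2.5);  /* no newline: stays in the stdio buffer */
        if (use_libc_exit)
            exit(0);                /* flushes stdio buffers, then exits */
        _exit(0);                   /* exits immediately, buffer is discarded */
    }

    close(fds[1]);                  /* parent: read until the child closes its end */
    char buf[128];
    long total = 0;
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

With exit() the parent should see all 17 bytes of "1.500000 2.500000"; with _exit() it should see none, which is the same symptom as the asm program that makes the exit system call itself.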
I'm assuming here that your program is exiting cleanly instead of crashing inside printf (which it might if rsp isn't 16-byte aligned before the call, because the usual variadic-function code-gen uses movaps to store the FP arg-passing registers to the stack).
Run your program under strace or ltrace to decode system calls or library function calls (those are Linux tools; on OS X, dtruss is the closest equivalent to strace).
Your original code (before the update to fix this problem) should be printing the low double from xmm0 and taking 8 bytes of data from the stack for the 2nd %f conversion (because al=1 means one FP arg in registers, with any remaining FP args on the stack). Or at least, that's what it puts in the stdout I/O buffer before you exit.
BTW, don't forget to ALIGN 16 the data you're going to use aligned loads on. Also, you picked an inefficient way to unpack (an integer shuffle is not needed here, and if you're going to integer shuffle, use pshufd for a copy+shuffle). You could have done this:
DEFAULT REL
...
movapd xmm0, [rel v1]
addpd xmm0, [rel v2]
movhlps xmm1, xmm0 ; high half of xmm0 -> low half of xmm1; false dependency on the old value of xmm1
Or
...
movapd xmm1, xmm0 ; copy
unpckhpd xmm1, xmm1 ; broadcast the high half
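The x86-64 System V calling convention doesn't require the upper half of arg-passing registers to be zero, including xmm regs, so you can leave whatever high garbage you want. (Caveat for integer regs: for compatibility with clang, narrow integer args should be sign- or zero-extended to 32 bits, because clang-compiled callees assume that.)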
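If it helps to see the add-then-unpack sequence in C, here's a rough equivalent with SSE2 intrinsics. It's a sketch, not a translation of your exact code: add_pairs and print_pairs_sum are hypothetical names, it assumes an x86-64 compiler, and it uses unaligned loads so the caller's arrays don't need 16-byte alignment.

```c
#include <emmintrin.h>   /* SSE2 intrinsics; SSE2 is baseline on x86-64 */
#include <stdio.h>

/* Hypothetical helper: packed-add two pairs of doubles, then split the sum
   into its low and high elements. */
void add_pairs(const double a[2], const double b[2], double *lo, double *hi)
{
    __m128d sum  = _mm_add_pd(_mm_loadu_pd(a), _mm_loadu_pd(b)); /* addpd */
    __m128d high = _mm_unpackhi_pd(sum, sum);  /* unpckhpd: broadcast the high half */
    *lo = _mm_cvtsd_f64(sum);   /* reads only the low double; upper half is ignored */
    *hi = _mm_cvtsd_f64(high);
}

/* Passing both doubles as separate args lets the compiler set up
   xmm0, xmm1 and al for the variadic call. */
void print_pairs_sum(const double a[2], const double b[2])
{
    double lo, hi;
    add_pairs(a, b, &lo, &hi);
    printf("%f %f\n", lo, hi);  /* real newline, so a terminal flushes the line */
}
```

Compiling this with optimization and looking at the asm output is a decent way to check what a compiler does for the shuffle and for the printf call.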