x86 NASM use printf for packed doubles

Question

I am new to working with SIMD instructions and am trying to use printf to print floats. I have looked at many possible solutions, but nothing seems to work: this code doesn't print anything when run. Here is the relevant code:

extern _printf

section .text
global _main

_main:
    ...
    movapd xmm0, oword [rel v1]
    movapd xmm1, oword [rel v2]
    addpd xmm0, xmm1
    movapd xmm1, xmm0
    psrldq xmm1, 8

    mov rax, 2
    mov rdi, fmt
    call _printf
    ...

section .data
fmt: db "%f %f\n", 0
v1: dq 1.1
    dq 2.2
v2: dq 3.3
    dq 4.4

I am working on a mac and here are the commands I am using to assemble and link:

nasm -g -f macho64 -o prog.o prog.asm
ld -lc -macosx_version_min 10.13 -lSystem -o prog prog.o

Did you check out the calling convention? Why do you expect `printf` to read two doubles from xmm0? — fuz, Dec 24 '17 at 00:23
I am not sure but suspected that I was only passing floats in one register (xmm0). Would you suggest putting one float into xmm1 and moving 2 into rax? — genghiskhan, Dec 24 '17 at 00:25
Yes. The x86-64 System V calling convention doesn't pack separate FP args into one vector register. And BTW, you're printing `double`, not `float`. printf has no conversion for `float` because a C caller can never pass one without promotion to `double`. — Peter Cordes, Dec 24 '17 at 00:39
I changed it to passing them in two registers and edited the code in the question however the program still does not print anything. Also, I'm aware I am printing doubles, I'm just using lazy terminology — genghiskhan, Dec 24 '17 at 00:48
Well don't be lazy; based on your question title I was expecting a duplicate of https://stackoverflow.com/questions/37082784/how-to-print-a-single-precision-float-with-printf. — Peter Cordes, Dec 24 '17 at 00:56

Peter Cordes · Accepted Answer · 2018-03-19T21:45:53.347

You're probably exiting without flushing stdio buffers. By default, stdout is line-buffered when connected to a terminal (or full-buffered otherwise).

Your format string doesn't end with a newline: it ends with a literal backslash and n, because NASM doesn't process C escape sequences inside double-quotes. Use backticks for that:

fmt: db `%f %f\n`, 0

Or use the numeric ASCII code for newline:

fmt: db "%f %f", 10, 0

When you're using C stdio function calls, you should exit by returning from main, or by calling libc's exit function, not by making a sys_exit system call directly. The library function flushes stdio buffers first and runs destructors and whatever else; the system call just exits.
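As a rough illustration of the point about exiting through libc, here is a minimal sketch for macOS x86-64, assuming the same nasm/ld commands as in the question. The label `msg` and its text are made up for the example. Note that macOS prefixes C symbols with an underscore, so the `_exit` called here is C's `exit()`, not the C function named `_exit` (that one would be `__exit`).

```nasm
; Minimal sketch (macOS, macho64): text with no trailing newline
; still appears, because libc's exit() flushes stdout.
default rel

extern _printf
extern _exit                ; C's exit(), with the macOS leading underscore

section .text
global _main

_main:
    sub   rsp, 8            ; rsp is 8 mod 16 on entry; realign to 16 before calls
    lea   rdi, [msg]        ; 1st integer arg: format string (RIP-relative)
    xor   eax, eax          ; al = 0: no FP args in xmm registers
    call  _printf           ; output sits in the stdout buffer for now

    xor   edi, edi          ; exit status 0
    call  _exit             ; libc exit(): flushes stdio buffers, then terminates
                            ; a raw exit system call here would skip the flush

section .data
msg: db "no trailing newline", 0   ; hypothetical message for illustration
```

Replacing the `call _exit` with `add rsp, 8` / `xor eax, eax` / `ret` should behave the same way, since returning from main also ends up in libc's exit path.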

I'm assuming here that your program is exiting cleanly instead of crashing inside printf. It might crash if rsp isn't 16-byte aligned before the call: the usual code-gen for variadic functions uses movaps to store the FP arg-passing registers to the stack, and movaps faults on a misaligned address.

Run your program under strace or ltrace to decode system call or library function calls (if OS X has both those tools).

Your original code (before the update to fix this problem) should be printing the low double from xmm0 and taking 8 bytes of data from the stack for the 2nd %f conversion (because al=1 means one FP arg in registers, with any remaining FP args on the stack.)

Or that's what it puts in the stdout I/O buffer before you exit.

BTW, don't forget to ALIGN 16 the data you're going to use aligned loads on. Also, you picked an inefficient way to unpack (integer shuffle is not needed here, and if you're going to integer shuffle, use pshufd for a copy+shuffle). You could have done this:

DEFAULT REL

...
movapd xmm0, [rel v1]
addpd  xmm0, [rel v2]
movhlps xmm1, xmm0         ; false dependency on the old value of xmm1 (only its low half is written)

Or

...
movapd    xmm1, xmm0    ; copy
unpckhpd  xmm1, xmm1    ; broadcast the high half

The x86-64 System V calling convention doesn't require the upper half of arg-passing registers to be zero, including xmm regs, so you can leave whatever high garbage you want. (Caveat for integer regs: for compatibility with clang, narrow integer args should be sign- or zero-extended to 32 bits.)
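Putting the points above together, a complete version of the program might look like the sketch below. It assumes macOS x86-64 and the build commands from the question. The layout (the `sub rsp, 8` realignment, `align 16` ahead of the data, returning 0 from main) is one reasonable arrangement rather than the only correct one.

```nasm
; Sketch of a fixed program (macOS, macho64), under the assumptions above.
default rel

extern _printf

section .text
global _main

_main:
    sub      rsp, 8          ; realign rsp to 16 bytes before calling printf

    movapd   xmm0, [v1]      ; xmm0 = { 1.1, 2.2 }
    addpd    xmm0, [v2]      ; xmm0 = { 1.1+3.3, 2.2+4.4 }; memory operand must be 16-byte aligned
    movapd   xmm1, xmm0      ; copy
    unpckhpd xmm1, xmm1      ; low half of xmm1 = high half of the sum

    lea      rdi, [fmt]      ; 1st integer arg: format string
    mov      eax, 2          ; al = 2: two FP args, in xmm0 and xmm1
    call     _printf

    add      rsp, 8          ; restore the stack pointer
    xor      eax, eax        ; return 0 from main, so libc flushes stdout on exit
    ret

section .data
align 16                     ; required for movapd / addpd memory operands
v1:  dq 1.1, 2.2
v2:  dq 3.3, 4.4
fmt: db `%f %f\n`, 0         ; backticks so NASM turns \n into a newline byte
```

With the default six digits of `%f`, this should print `4.400000 6.600000` followed by a newline.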
