Can't output coprocessor float from variable two times in a row

Question

Good afternoon! In this example I add two floating-point numbers, store the result in a `tbyte` variable, and display that same variable twice in a row. The first time I get 11.1, as expected, but the second time I get 4.667261E-062. Why is this happening?

One more question: is it possible to store `tbyte` numbers and access them like an array? For example, when storing numbers as `dd`, I could save and read them in steps of 4: result[0], result[4], etc. Can the same be done with `tbyte`, and how? If I understand correctly, the step should be 10.

.386
.model flat,stdcall
option casemap:none

include \masm32\include\masm32rt.inc

.data
titletext db  'Title',0
frmt db 'Result1 = %.7G',10
     db 'Result2 = %.7G',0
buff db 1024 dup (?)
result tbyte ?
num1 qword 5.5
num2 qword 5.6

.code
start:
    finit
    fld qword ptr [num1]
    fld qword ptr [num2]
    fadd
    fstp qword ptr [result]

    invoke crt_sprintf,addr buff,addr frmt, result, result
    invoke MessageBox,0,addr buff,addr titletext,MB_OK
    invoke ExitProcess,0
end start

Why are you doing a `fstp qword` (`double`) into a tbyte (`long double`)? Also, you didn't balance the x87 stack; use `faddp` so it's empty after the `fstp`. — Peter Cordes, Mar 27 '20 at 14:02
@PeterCordes in my situation I need to use `qword` for initial variables and a `tbyte` for result. I don't know how to do this better, any suggestions here? About `faddp`: it is asking me for some kind of operands, what should I include there? — John, Mar 27 '20 at 14:13
I'm surprised `fadd` assembled without any explicit operands but `faddp` didn't; I would have expected the opposite. Anyway, see my answer. But why do you need to use `tbyte` for `result`? — Peter Cordes, Mar 27 '20 at 14:18

Peter Cordes · Accepted Answer · 2020-03-27T14:19:51.397

Why are you doing a fstp qword (double) into a tbyte (long double)?

Oh, that's probably your bug. Presumably the `invoke` macro pushes 12 bytes for each of the two `result` args, because a 10-byte `tbyte` padded to a multiple of the 4-byte stack slot size is 12 bytes.

But your format string only tells sprintf to look for double args, which are 8 bytes wide. Since you only stored a qword double to the low 8 bytes of result, crt_sprintf can correctly read the first variadic arg as a double. (x86 is little-endian so the low 8 bytes are at the stack address sprintf is looking at.)

But the 2nd `%G` conversion looks for another `double` right after the end of the previous arg, which according to the format string should be 8 bytes later. What your `invoke` actually pushed doesn't match that, so the 2nd `%G` reads 8 bytes that straddle the two 12-byte pushes.

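Assuming that 12-byte layout, those 8 bytes are the last 4 bytes of the first push (the unused top 2 bytes of `result` plus 2 bytes of padding, probably all zero) followed by the first 4 bytes of the second push (the low 4 bytes of the `double` 11.1, which are `0x33333333`). Read as a little-endian `double`, the `0x33333333` lands in the upper half, which holds the sign bit, the exponent, and the top mantissa bits, so the value is roughly 1.2 × 2^-204 ≈ 4.667E-62: the tiny number you saw. You can use a debugger to examine that memory as a `double` and confirm it represents the value `sprintf` read.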

If long double in that C library is the 10 byte x87 type, use %LG and use fstp tbyte.

If `sizeof(long double)` is only 8, then it's the same as `double` and you can't printf x87 `tbyte` values with that C library (unless it has some non-standard extension for it). In that case, just change `result` to also be a `qword`, matching the store you're doing.
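
Storing and reloading the 80-bit value itself doesn't depend on the C library; only printing it does. As a minimal sketch (assuming the same MASM32 setup as in the question; `results` and `asdbl` are hypothetical names, and this hasn't been checked against every MASM version), `tbyte` elements can be laid out back to back and addressed in steps of 10 bytes, then narrowed to a `qword` only for printing:

```asm
.386
.model flat,stdcall
option casemap:none

include \masm32\include\masm32rt.inc

.data
titletext db 'Title',0
frmt      db 'results[1] = %.7G',0
buff      db 256 dup (?)
num1      qword 5.5
num2      qword 5.6
results   tbyte 4 dup (?)     ; hypothetical array: 10-byte elements at offsets 0, 10, 20, 30
asdbl     qword ?             ; hypothetical 8-byte copy used only for printing with %G

.code
start:
    finit
    fld   qword ptr [num1]
    fstp  tbyte ptr [results]        ; element 0, widened to 80 bits by the store
    fld   qword ptr [num2]
    fstp  tbyte ptr [results+10]     ; element 1 starts 10 bytes after element 0

    mov   ecx, 1                     ; element index
    imul  eax, ecx, 10               ; byte offset = index * 10
    fld   tbyte ptr [results+eax]    ; reload element 1
    fstp  qword ptr [asdbl]          ; narrow to double; x87 stack is empty again

    invoke crt_sprintf,addr buff,addr frmt,asdbl
    invoke MessageBox,0,addr buff,addr titletext,MB_OK
    invoke ExitProcess,0
end start
```

Note that x87 has `fld` and `fstp` forms for `tbyte` operands but no non-popping `fst tbyte`, so a value that should stay on the x87 stack has to be duplicated first (`fld st(0)`) or reloaded afterwards.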

Also, you didn't balance the x87 stack; use faddp so it's empty after the fstp. (If your assembler requires an operand, use faddp st(1) or st1, however it likes to spell x87 register names.)

You're technically violating the calling convention by making a function call with the x87 stack non-empty, but apparently crt_sprintf doesn't use all 8 of st0..7 so it doesn't get a NaN from overflowing the x87 stack.
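
Putting these points together, a minimal sketch of a corrected version, assuming a `double` result is acceptable and that the assembler wants the explicit two-operand spelling `faddp st(1), st(0)` (other assemblers may spell it differently):

```asm
.386
.model flat,stdcall
option casemap:none

include \masm32\include\masm32rt.inc

.data
titletext db  'Title',0
frmt db 'Result1 = %.7G',10
     db 'Result2 = %.7G',0
buff db 1024 dup (?)
result qword ?                  ; 8-byte double, matching what %G expects
num1 qword 5.5
num2 qword 5.6

.code
start:
    finit
    fld qword ptr [num1]        ; st(0) = 5.5
    fld qword ptr [num2]        ; st(0) = 5.6, st(1) = 5.5
    faddp st(1), st(0)          ; add and pop: only the sum is left, in st(0)
    fstp qword ptr [result]     ; store as double and pop: x87 stack is now empty

    invoke crt_sprintf,addr buff,addr frmt, result, result
    invoke MessageBox,0,addr buff,addr titletext,MB_OK
    invoke ExitProcess,0
end start
```

With `result` declared as a `qword`, each argument occupies 8 bytes on the stack, which is what the two `%G` conversions in the format string expect.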

edited Mar 27 '20 at 14:19

answered Mar 27 '20 at 14:13

I've updated the code, changed all to `qword`. I kind of understood it, hope I did not miss anything. With this kind of code it gives me `11.1` in both results. But I did not manage to get `faddp` to work. It says that `st(1)` can't be the first operand. And also, if I want my result to be stored in `tbyte`, what could I do for it? I need to convert it somehow before storing? – John Mar 27 '20 at 14:32
@John123: x87 stores convert on the fly according to the operand size you specify. Also, don't edit your question to invalidate answer. (And the code doesn't match the text which still claims it prints wrong). I rolled it back for you. If `%LG` printed a `qword` correctly, then you just can't print tbyte floats with that C library (unless it has some other non-standard format option, or a special function or build option...) – Peter Cordes Mar 27 '20 at 14:36
re: `faddp`: read the Intel manual entry for `faddp` that I linked; see what way of writing it makes your assembler happy. Like maybe `faddp st(1), st(0)` or maybe `faddp st1` if it didn't like `faddp st(1)`. Intel documents `faddp` with no operands as a synonym for `faddp st(1), st(0)`. – Peter Cordes Mar 27 '20 at 14:37
ok, thank you! It works with `faddp st(1), st(0)`. One more question: is there a library that will help me output a `tbyte`? I spend a lot of time even to find a function that could output a float correctly. – John Mar 27 '20 at 14:46
@John123: IDK, maybe. On GNU/Linux, `long double` is a `tbyte`. Unfortunately for you, Microsoft made `long double` = `double` in the standard Windows ABI. See also https://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/ - MSVC's CRT even sets the x87 precision control bits to round to `double` so unless you also disable that it's pointless to use `tbyte` anyway. Although `finit` might be doing that, I forget. x87 is mostly obsolete anyway; new code should use SSE2 unless 80-bit extended precision is necessary. Anyway, IDK, maybe ask a new question. – Peter Cordes Mar 27 '20 at 15:17
