
I am tracing through this compiled code (I don't know which compiler produced it, and I don't have the source code).

Sub1:
mov edx,[esp+04h]
and edx,00000300h
or  edx,0000007Fh
mov [esp+06h],dx
fldcw   word ptr [esp+06h]
retn

My understanding:

Sub1(4byte param1)
edx=param1&0x00000300|0x0000007F
higher 2 bytes of param1 = lower 2 bytes of edx
fldcw ???????

fldcw loads the control word. But what is the control word of a floating-point unit?

The result is stored into the upper 2 bytes of param1. Am I right?

What could be the purpose of this subroutine?

Peter Cordes
barej
  • Here I show another usage of that register: enabling and disabling floating point CPU exceptions: https://stackoverflow.com/questions/18118408/what-is-difference-between-quiet-nan-and-signaling-nan/55648118#55648118 – Ciro Santilli OurBigBook.com Apr 13 '19 at 20:36

1 Answer


FLDCW is an instruction that loads the 16-bit control word for the x87 FPU. The bit layout of the control word can be found on this Intel web page for example.

The lower eight bits of the control word contain the masks for the IEEE-754 defined exceptions. ORing in 0x7F sets mask bits 0 through 5, which masks all floating-point exceptions; bit 6, which 0x7F also sets, and bit 7 are not used.
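As a rough illustration of the mask bits, here is a small C sketch that works on plain integers and never touches the FPU. The helper names (`x87_mask_name`, `x87_all_masked`) and the macro are hypothetical, introduced only for this example; the bit assignments follow the standard x87 control word layout.

```c
#include <assert.h>
#include <stdint.h>

/* The six x87 exception mask bits occupy bits 0..5 of the control word. */
#define X87_EXC_MASKS 0x003Fu

/* Exception controlled by each mask bit, indexed by bit position. */
static const char *const x87_mask_names[6] = {
    "invalid operation", "denormal operand", "divide by zero",
    "overflow", "underflow", "precision (inexact)"
};

/* Hypothetical helper: name of the exception controlled by mask bit `bit`. */
const char *x87_mask_name(int bit)
{
    assert(bit >= 0 && bit < 6);
    return x87_mask_names[bit];
}

/* Hypothetical helper: nonzero when all six exceptions are masked. */
int x87_all_masked(uint16_t cw)
{
    return (cw & X87_EXC_MASKS) == X87_EXC_MASKS;
}
```

With these definitions, `x87_all_masked(0x007F)` is nonzero, and the only bit of 0x7F outside the mask field is bit 6 (0x40).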

The upper eight bits of the control word contain the precision control in bits 8 and 9, and the rounding control in bits 10 and 11. By ANDing with 0x300 the precision control PC currently in force is passed through untouched, while the rounding control RC is forced to 0, which corresponds to the IEEE-754 rounding mode "round to nearest or even".
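The two fields can be pulled out with shifts and masks. The following C sketch operates on integer values only; the function names are made up for this example and are not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Precision control: bits 8-9 (0 = single, 2 = double, 3 = extended). */
unsigned x87_pc(uint16_t cw) { return (cw >> 8) & 3u; }

/* Rounding control: bits 10-11 (0 = nearest, 1 = down, 2 = up, 3 = toward zero). */
unsigned x87_rc(uint16_t cw) { return (cw >> 10) & 3u; }

/* Hypothetical helper: replace RC and keep every other bit. */
uint16_t x87_with_rc(uint16_t cw, unsigned rc)
{
    assert(rc <= 3u);
    return (uint16_t)((cw & ~0x0C00u) | (rc << 10));
}
```

For instance, starting from a control word of 0x0F7F (PC = 3, RC = 3), the AND with 0x300 leaves PC at 3 and clears RC to 0.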

It is impossible to say what exactly the purpose of this function is. It is passed a 4-byte integer on the stack at [esp+4], which is removed by the caller, suggesting C calling conventions. The 4-byte integer passed in is presumably the saved previous value of the FPU control word, stored with FSTCW and zero-extended from two to four bytes. The values forced for rounding control and exception masks suggest that this function is used to restore some compiler's math library defaults for the x87 control word, but there is no way of knowing this for sure without additional context.
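To make the data flow concrete, the integer part of Sub1 can be modeled in C as below. This is only a sketch of the value handed to FLDCW, assuming the argument really is a zero-extended control word; `sub1_new_control_word` is a name invented here, and the actual load into the FPU is not performed.

```c
#include <assert.h>
#include <stdint.h>

/* Value that Sub1 passes to FLDCW, given its 4-byte stack argument. */
uint16_t sub1_new_control_word(uint32_t param1)
{
    /* and edx,00000300h / or edx,0000007Fh */
    uint32_t edx = (param1 & 0x00000300u) | 0x0000007Fu;

    /* The AND/OR leave only bits 0-9 possibly set, so nothing is lost
       when only dx is stored. */
    assert(edx <= 0xFFFFu);

    /* mov [esp+06h],dx writes the low 16 bits into the upper half of the
       argument slot, which serves as scratch memory for FLDCW. */
    return (uint16_t)edx;
}
```

Under this model an input of 0x0F7F yields 0x037F, and an input of 0x027F comes back unchanged.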

Solomon Ucko
njuffa
  • Thank you very much. I am reading the article. However, for right now, I am wondering: if this function is equivalent to `round(x)`, why is the input argument 4 bytes instead of 8 bytes? – barej Aug 08 '15 at 16:15
  • Since the code you showed sets the rounding control `RC` to 0 (round to nearest or even), I don't see how this code would be related to `floor()`. The `floor()` function rounds down, that is, toward negative infinity, which corresponds to an `RC` value of 1, not 0. The code here looks like it is setting up x87 defaults for a compiler's library, but without additional context that is pure speculation. The FPU control word is a 16-bit integer; presumably the code above is from a subroutine that takes the control word value as a zero-extended 4-byte integer argument at [esp+4], but that is also speculation. – njuffa Aug 08 '15 at 16:15
  • Before this subroutine is called, the instruction `fstcw word ptr [esp]` is executed. That is probably related to what you mean. If an FPU control word is passed to the function, where is the floating-point number to be rounded? Is it in `ST(0)` of the FPU? – barej Aug 08 '15 at 16:21
  • This code merely sets up the x87 FPU in a certain way. For what purpose it is set up this way is impossible to say, but the value written to the control word suggests that this function is supposed to restore a default setup for a math library. This code just sets up the mode the FPU is operating in; it does not act on any values in the floating-point registers. – njuffa Aug 08 '15 at 16:27
  • The upstream code checks whether the control word is `0x27F`, which looks like the default used on Windows (`PC` = double precision, `RC` = round-to-nearest-or-even, FPU exceptions masked). If the value is not the default of `0x27F` (e.g. because user code modified the control word), it forces the defaults by calling subroutine 1. However, this forcing seems potentially flawed since it does not adjust `PC` to 2 (double precision). If user code had set `PC` to 0 (single precision), downstream code could fail. But here the next instruction is `FYL2X`, which does not obey `PC`, so there is no problem. – njuffa Aug 08 '15 at 17:05
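
The check described in the last comment can be modeled on plain integers as follows. This is a hypothetical reconstruction from the comment's description, not the actual upstream code; `upstream_fixup`, `sub1_value`, and `pc_field` are names invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define CW_DEFAULT 0x027Fu  /* value the upstream code is said to compare against */

/* Same bit manipulation as Sub1 in the question. */
static uint16_t sub1_value(uint16_t cw)
{
    return (uint16_t)((cw & 0x0300u) | 0x007Fu);
}

/* Hypothetical model of the caller: control word in force afterwards. */
uint16_t upstream_fixup(uint16_t cw)
{
    if (cw == CW_DEFAULT)
        return cw;                      /* already the default, nothing to do */

    uint16_t fixed = sub1_value(cw);
    assert((fixed & 0x0C00u) == 0);     /* RC is always forced to 0 */
    return fixed;
}

/* Precision control field, bits 8-9. */
unsigned pc_field(uint16_t cw) { return (cw >> 8) & 3u; }
```

In this model, a control word of 0x0E7F (only `RC` changed) is brought back to 0x027F, whereas 0x007F (`PC` = 0) stays at 0x007F, which is the gap the comment points out.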