
I've been working through Richard Blum's Professional Assembly Language. Its first floating-point comparison program works, but when I run it through gdb on Linux, some of the intermediate steps are unexpected.

To experiment:

Here's the program (the comments are mine, and I dropped a nop that isn't needed when debugging with DWARF info):

.section .data
value1:
    .float 10.923
value2:
    .float 4.5532

.section .text
.globl _start
_start:

    flds value1     # Load a 32-bit floating-point value into the FPU stack
    fcoms value2    # Compare ST0 with the 32-bit float stored at value2

        # Retrieving the value of the FPU status register (with no argument ...
    fstsw   # ... this defaults to AX)

    sahf    # Store AH into Flags: 00 => CF, 02 => PF, 06 => ZF
    ja greater
    jb lessthan

    movl $1, %eax   # Exit (condition: value1 = value2)
    movl $0, %ebx
    int $0x80

greater:
    movl $1, %eax   # Exit (condition: value1 is greater than value2)
    movl $2, %ebx
    int $0x80
lessthan:
    movl $1, %eax   # Exit (condition: value1 is less than value2)
    movl $1, %ebx
    int $0x80

(I've also got a version of it translated and running in nasm syntax if anyone wants to give that one a whirl -- but my quick test of that didn't seem to yield different results.)

It's built and linked with:

as -o fcomtest.o fcomtest.s --32 --gdwarf-2
ld -o fcomtest fcomtest.o -m elf_i386

I'm using the following gdb input script (you may have to add or drop a 'step' command in the last few lines, depending on the values of value1 and value2 and therefore on which branch is taken):

# suppressing the output when setting the breakpoint at _start 
set logging file /dev/null
set logging redirect on
set logging on
br _start
set logging off

printf "\nFirst outputting values1 and 2:\n"
x/f &value1
x/f &value2

# suppressing the output when the breakpoint is triggered
set logging on
run
set logging off

printf "Now, ST0 = %f\n", $st0

printf "\nChecking the initial value of the FPU status register...\n"
print/t $fstat

printf "\nfcoms value2  # Comparing the value in ST0 with mem address &value2\n\n"

set logging on
step
set logging off

printf "The FPU status register FSTAT now contains...\n"
print/t $fstat

printf "\nBefore copying FSTAT, the AX register contains: "
print/t $ax

printf "\nfstsw     #  Copying (16-bit) FPU status register, FSTAT, to AX"
set logging on
step
set logging off

printf "\n\nNow the contents of the AX register are: \n"
print/t $ax

printf "\nInitially, the EFLAGS register in binary is: \n"
print/t $eflags

printf "\nThat is, the flags set are: "
print $eflags

printf "\nsahf      # store the high 8-bits of AX into the corresponding flags "

set logging on
step
set logging off

printf "\n\nNow, the EFLAGS register contains: \n"
print/t $eflags

printf "\nIn other words, these flags are set: "
print $eflags

printf "\nNow branch according to the CF and ZF flags...\n"
step
step
step
step
q

Finally, to check the debugger output and reproduce this easily, I'm using this command-line:

$ gdb -q -x gdb-input-script > gdb.output fcomtest

You can then see the output churned out with:

$ cat gdb.output

** The theory **

The mechanism of how this is supposed to work is discussed in these posts: x86 assembler: floating point compare, Assembly: JA and JB work incorrectly, and in the articles linked from them.

In particular, FCOM is supposed to compare the value in the FPU stack's ST0 with another value and set the C3, C2, and C0 condition code bits of the FPU status register according to:

+-----------------+----+-----+----+
| Condition       | C3 |  C2 | C0 |
+-----------------+----+-----+----+
| ST0 > argument  | 0  |  0  | 0  |
| ST0 < argument  | 0  |  0  | 1  |
| ST0 = argument  | 1  |  0  | 0  |
+-----------------+----+-----+----+
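For reference, here is a minimal Python sketch of how a 16-bit status word splits into these fields. It assumes the layout given in the Intel documentation (C3 at bit 14, the stack-top pointer TOP at bits 13-11, C2 at bit 10, C1 at bit 9, C0 at bit 8). The value 0x3800 is a hypothetical reading, not taken from the output above: it is what the register would hold with TOP = 7 and all condition codes clear. The sketch also assumes that gdb's print/t drops leading zeros, which is worth checking against your gdb version, because an unpadded 14-digit string is easy to misalign against a 16-bit diagram.

```python
# Field positions in the 16-bit x87 status word (layout assumed from the Intel manual).
C0_BIT, C1_BIT, C2_BIT, C3_BIT = 8, 9, 10, 14
TOP_SHIFT, TOP_MASK = 11, 0b111


def decode_fstat(fstat):
    """Split a 16-bit x87 status word into its condition codes and TOP."""
    return {
        "C3": (fstat >> C3_BIT) & 1,
        "C2": (fstat >> C2_BIT) & 1,
        "C1": (fstat >> C1_BIT) & 1,
        "C0": (fstat >> C0_BIT) & 1,
        "TOP": (fstat >> TOP_SHIFT) & TOP_MASK,
    }


def padded_binary(value, width=16):
    """Render value with leading zeros so that bit 15 is always the first digit."""
    return format(value, "0{}b".format(width))


# Hypothetical reading: TOP = 7 after a single load, all condition codes clear.
example = 0x3800
print(format(example, "b"))    # unpadded: 14 digits, so the first digit is bit 13
print(padded_binary(example))  # padded: 16 digits, so the first digit is bit 15
print(decode_fstat(example))
```

Read this way, the three leading 1s of the unpadded string belong to TOP rather than to bit 15, C3, and bit 13.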

Also, SAHF is supposed to copy specific bits of AH into EFLAGS. Instead, here's an outline of what actually happens with the values above (the "manual" is the Intel documentation):

** The strange part **

Case 4II (value1 +ve, value2 +ve; val1 greater magnitude, val2 lesser magnitude)

15      14      13      12      11      10      09      08  -- FPU status reg.
FPU     C3      SP      SP      SP      C2      C1      C0

07      06      05      04      03      02      01      00  -- AH register

1       1       1       0       0       0       0       0

[In this result of FCOM:  C3 has been set to 1 -- not what was expected.  The
expected result was C3 = 0, C2 = 0, C0 = 0]

SAHF instruction:
From the manual: 07 => SF, 06 => ZF, 04 => AF, 02 => PF, 00 => CF
From the textbook: 06 => ZF, 02 => PF, 00 => CF
Actual behavior: -07 => SF, -06 => ZF, -04 => AF, 02 => PF ?, 00 => CF ?

EFLAGS register:

0       0       0       0       0       0       1       0   -- Before SAHF

SF      ZF      -       AF      -       PF      -       CF

0       0       0       1       0       0       1       0   -- After SAHF

[Here, CF = 0 and ZF = 0, so the JA branch is taken and the result is as
desired]
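The SAHF mapping quoted from the manual can be modeled in a few lines of Python. This is a sketch under stated assumptions, not a description of what gdb does internally: it assumes SAHF copies only AH bits 7, 6, 4, 2, and 0, and that the reserved EFLAGS bits 1, 3, and 5 keep their fixed values of 1, 0, and 0. The starting EFLAGS value 0x202 and the AH value 0x38 are illustrative inputs (0x38 being the high byte of a status word with TOP = 7 and C3 = C2 = C0 = 0).

```python
SAHF_MASK = 0b11010101     # AH bits 7, 6, 4, 2, 0 -> SF, ZF, AF, PF, CF
RESERVED_ONE = 0b00000010  # EFLAGS bit 1 always reads as 1


def sahf(eflags, ah):
    """Return EFLAGS after SAHF: the low byte is rebuilt from AH, upper bits are kept."""
    low = (ah & SAHF_MASK) | RESERVED_ONE
    return (eflags & ~0xFF) | low


ah = 0x38                # assumed: TOP = 7, C3 = C2 = C0 = 0
after = sahf(0x202, ah)  # 0x202 is an assumed starting EFLAGS (IF plus reserved bit 1)
print(hex(after))        # 0x212
print("AF =", (after >> 4) & 1, "ZF =", (after >> 6) & 1, "CF =", after & 1)
```

Under these assumptions, AF ends up set only because a TOP bit lands on AH bit 4, while ZF and CF stay clear, which is the combination JA needs.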

I know that was somewhat long (but it shows how to reconstruct this very easily). In summary: if you change value1 and value2 and rebuild, the following is what actually seems to be occurring, at least in gdb's output:


C3      C2      C0
1       0       0   -- Equal numbers case (behavior as described in the documentation)

-----------------------------------------------

1       1       0   -- Value1 < Value2, actual
    (SAHF acts as though it has a NOT applied before altering the flags)

(The status register itself seems to be the inverse of the expected:)
0       0       1   -- expected

-----------------------------------------------

1       0       0   -- Value1 > Value2

0       0       0   -- expected

(Output of SAHF contorts somehow to permit CF = ZF = 0 )
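To compare against the documented behavior, here is a small Python sketch that follows each documented FCOM outcome through fstsw, sahf, and the two conditional jumps. It is a model of the documented rules only, with assumptions labeled in the comments: TOP is taken to be 7 (one value loaded), JA is taken when CF = 0 and ZF = 0, and JB is taken when CF = 1. The function name branch_after_fcom is made up for this illustration.

```python
def branch_after_fcom(c3, c2, c0):
    """Return (status word, label reached) for a given FCOM outcome."""
    top = 7                                  # assumed: one value on the FPU stack
    fstat = (c3 << 14) | (top << 11) | (c2 << 10) | (c0 << 8)
    ah = fstat >> 8                          # fstsw leaves the status word in AX
    zf = (ah >> 6) & 1                       # sahf: AH bit 6 -> ZF
    cf = ah & 1                              # sahf: AH bit 0 -> CF
    if cf == 0 and zf == 0:
        return fstat, "greater"              # ja taken
    if cf == 1:
        return fstat, "lessthan"             # jb taken
    return fstat, "equal"                    # falls through to the first exit


for name, codes in [("ST0 > arg", (0, 0, 0)),
                    ("ST0 < arg", (0, 0, 1)),
                    ("ST0 = arg", (1, 0, 0))]:
    fstat, label = branch_after_fcom(*codes)
    print(name, format(fstat, "016b"), label)
```

Printing the status word zero-padded to 16 digits makes it easier to line up against the bit diagram than an unpadded binary string.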

I've described how to reproduce this, so it's just a copy-paste to review the results. Is it simply some weirdness in gdb that changes the values of C3, C2, and C0 and then adjusts to make the flags work? In all cases, the correct branch is ultimately taken. I haven't experimented with other debuggers (I'm not practiced with them) to see whether the debugger just garbles the intermediate steps for the value1 != value2 cases.

Peter Cordes
muodostus
  • Your script goes `br _start`, `run`, `step`, `print/t $fstat` -- which will give you the `$fstat` after the `flds value1` and before the `fcoms value2`, will it not ? [With assembler I feel more comfortable with `stepi` (`si`), for what it's worth.] If you add `disass $pc,+1` to the script, before the `print`, that will show the *next* instruction to be executed. After `display/i $pc` gdb will automatically show the *next* instruction to be executed whenever it stops -- in particular after an `si`. – Chris Hall Apr 15 '20 at 08:42
  • @ChrisHall: note that `stepi` treats `fstsw` as two separate instructions: the `fwait` and the real `fnstsw`. See https://www.felixcloutier.com/x86/fstsw:fnstsw. (The `fwait` is irrelevant on 286 or 386 and later or something, but assemblers still slavishly follow the defined encodings that includes that `fwait` nop even for 32 and 64-bit code). The OP built with asm-source debug info so `step` will step the whole line, and presumably make their dead-reckoning work out. It could be simpler to use `starti` as an equivalent to setting a breakpoint and using `run`, though. – Peter Cordes Apr 15 '20 at 15:27
  • If we don't trust GDB's `p /x $eflags`, maybe try `pushf`. I get `$eflags = 0x212` but `p /x *(int*)$sp` being `0x312` after `sahf` / `pushf`. Oh, the low bit of the 2nd byte is TF, and I was single-stepping interactively. https://en.wikipedia.org/wiki/FLAGS_register. But other than that they match. – Peter Cordes Apr 15 '20 at 15:34
  • @PeterCordes: thanks for the detail on `fwait`. I think the problem is that the `print/t $fstat` is being done *before* the `fcoms` and that the "dead-reckoning" is then out almost from the beginning. – Chris Hall Apr 15 '20 at 16:06
  • @ChrisHall: IDK, I didn't try to follow that script. Too much of a mess. I just tried it interactively in GDB. I'm not sure I found any mismatches. I had thought AF getting set was a surprise because the low byte of `pushf` didn't match `AH`, but I think once you take into account the reserved bits that `sahf` leaves unset it matches up ok. `0x38` masks down to `0x10`, and the always-set bit #1 in EFLAGS gives us `0x12` – Peter Cordes Apr 15 '20 at 16:17
