0

I'm aware of floating point inaccuracies, but I need a way around them.

So, I need to do a graph traversal, and I'm doing it by getting all the nodes into the stack, adding routes to all the neighbors, then adding a large route to all the non-neighbors, finally going through and finding the best routes. It works fine for the routes that were created from the neighbors, but it breaks on the routes that were created with a large number.

Here's the relevant sections of code.

Here the large route is assigned.

movsd xmm0, [big]
mov [r11+rax+TAR], r9
movsd [r11+rax+DIS], xmm0
mov [r11+rax+HOP], r9

Here the comparison is made, in order to see if a better route is found.

ucomisd xmm0, xmm2
ja  bMBD
jmp eMBD
bMBD:
    mov rax, [r8+ROT]
    mov [rax+r13+HOP], r11
    movsd [rax+r13+DIS], xmm2
eMBD:

The above block of code is where the problem is, following is a breakdown of what I believe the code is doing.

D asks D for better route, but nothing happens: good.

D asks A, finds one and calculates distance as 3.3+5.4=8.7, it's better and replaces it's current route: good.

D asks B, finds one with a distance of 8.7+0=8.7. For some reason this is regarded as better and it's route is replaced: bad.

So, why is the ja condition returning that 8.7 > 8.7, but only on the routes that had their distance initialized to a large number (I've tried reducing the size of the number, but that has yielded the same results)?

Thank You.

Community
  • 1
  • 1
DeafFrog
  • 75
  • 9
  • Post debugger output after the `ucomisd` instruction, showing `xmm0`, `xmm2` and `eflags`. – Jester Nov 10 '16 at 12:11
  • If that's hand-tuned asm, beware that indexed addressing modes like `[r11+rax+TAR]` [don't stay micro-fused in the out-of-order core for Intel CPUs pre-SKL](http://stackoverflow.com/questions/26046634/micro-fusion-and-addressing-modes/31027695#31027695). So you'd actually save a fused-domain uop by using an ALU instruction to calculate an address in a register, so then all three stores could micro-fuse. If you don't have a spare register, or ALU uops are a bottleneck but the frontend (total fused-domain uops) isn't, then you shouldn't change anything. – Peter Cordes Nov 11 '16 at 04:13

1 Answers1

3

Rounding differences are a fact of life in floating point math. This is explained here.

If you only need to find any one of the best routes, then I see no problem in 3.3 + 5.4 > 8.7 + 0.0; either one of the routes is fine.

If you need to retrieve all routes that share the same optimal distance, then I agree you really need 3.3 + 5.4 to be equal to 8.7 + 0.0. There are a few ways to accomplish this; see below. For sample code, I will be using pseudo code rather than assembly, since this issue is not specific to any language.

1. Ignore minor differences

Consider numbers to be equal if the absolute value of their difference is less than a certain small number (e.g. 0.001). This means the comparison logic becomes slightly more complex. Instead of:

if x < y then
    return LESS_THAN
elseif x > y then
    return GREATER_THAN
else
    return EQUAL

you would be doing:

epsilon = 0.001
if x < y - epsilon then
    return LESS_THAN
elseif x > y + epsilon then
    return GREATER_THAN
else
    return EQUAL

2. Use fixed point numbers

This is easier than it sounds; just multiply everything with a certain factor that will eliminate the fractional part of every number, and perform all your calculations based on integers rather than floats.

For distances NNNN.N (i.e. no more than one decimal digit behind the decimal point), this factor is 10.

So instead of:

routeA = 3.3 + 5.4
routeB = 8.7 + 0.0
if routeA > routeB then ...

you would be doing:

routeA = 33 + 54
routeB = 87 + 0
if routeA > routeB then ...
Community
  • 1
  • 1
Ruud Helderman
  • 10,563
  • 1
  • 26
  • 45