To make proper use of extended-precision math, it's necessary that a language support a type which can be used to store the result of intermediate computations, and can be substituted for the expressions yielding those results. Thus, given:
void print_dist_squared(double x1, double y1, double x2, double y2)
{
printf("%12.6f", (x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));
}
there should be some type that could be used to capture and replace the common sub-expressions x2-x1
and y2-y1
, allowing the code to be rewritten as:
void print_dist_squared(double x1, double y1, double x2, double y2)
{
some_type dx = x2-x1;
some_type dy = y2-y1;
printf("%12.6f", dx*dx + dy*dy);
}
without altering the semantics of the program. Unfortunately, ANSI C failed to specify any type which could be used for some_type
on platforms which perform extended-precision calculations, and it became far more common to blame Intel for the existence of extended-precision types than to blame ANSI's botched support.
In fact, extended-precision types have just as much value on platforms without floating-point units as they do on x87 processors, since on such processors a computation like x+y+z would entail the following steps:
- Unpack the mantissa, exponent, and possibly sign of x into separate registers (exponent and sign can often "double-bunk")
- Unpack y likewise.
- Right-shift the mantissa of the value with the lower exponent, if any, and then add or subtract the values.
- In case x and y had different signs, left-shift the mantissa until the leftmost bit is 1 and adjust the exponent appropriately.
- Pack the exponent and mantissa back into double format.
- Unpack the that temporary result.
- Unpack z.
- Right-shift the mantissa of the value with the lower exponent, if any, and then add or subtract the values.
- In case the earlier result and z had different signs, left-shift the mantissa until the leftmost bit is 1 and adjust the exponent appropriately.
- Pack the exponent and mantissa back into double format.
Using an extended-precision type will allow steps 4, 5, and 6 to be eliminated. Since a 53-bit mantissa is too large to fit in less than four 16-bit registers or two 32-bit registers, performing an addition with a 64-bit mantissa isn't any slower than using a 53-bit mantissa, so using extended-precision math offers faster computation with no downside in a language which supports a proper type to hold temporary results. There is no reason to fault Intel for providing an FPU which could perform floating-point math in the fashion that was also the most efficient method on non-FPU chips.