Why is the conversion in non-strictfp mode considered as the one losing information?

Question

I understand the conversion in strictfp mode is used for portability, not for accuracy as noted in this question. However, The Java Language Specification, Java SE 8 Edition says that

A widening primitive conversion from float to double that is not strictfp may lose information about the overall magnitude of the converted value.

which suggests to me that a widening primitive conversion that is strictfp is intended for accuracy. Furthermore, I suspect that double can represent literally every value that float can take, in which case I see no reason why a conversion from float to double would be an issue here.

EDIT:

The wording "... may lose information about..." in the spec gave me the impression that the conversion in non-strictfp mode lacks some kind of accuracy compared to that in strictfp mode. It did not make sense to me because the conversion in non-strictfp mode possibly makes use of intermediate values in higher precision. This question was first written based on this understanding, and might not look as desirable as you expected.

"*...that a widening primitive conversion that is strictfp is intended for accuracy*" ... well, I suppose. A practical view is that all floating-point representations are approximations of actual values, even if a subset of values may be fortuitously represented exactly. — scottb, Jan 31 '16 at 08:18
@scottb The linked answer suggests that, without a strictfp keyword, the value is more likely to be accurate. It seems to me that a strictfp expression isn't always an accurate one. — eca2ed291a2f572f66f4a5fcf57511, Jan 31 '16 at 08:26
A non strictfp float may have an exponent that is smaller than the smallest double exponent. A strictfp float cannot. — Patricia Shanahan, Jan 31 '16 at 10:42
@PatriciaShanahan If that's true, isn't non-strictfp the more accurate one? Why does The Java Language Specification describe it as the one losing information? — eca2ed291a2f572f66f4a5fcf57511, Jan 31 '16 at 11:26
@Il-seobBae Non-strictfp can avoid underflows to zero that would have happened in strictfp. In that sense, it is more accurate. — Patricia Shanahan, Jan 31 '16 at 11:28
@PatriciaShanahan Do you mean to say that an underflow is necessary to adequately represent a numeric value? I got your point, but I barely understand it. Can you give me a specific example which suggests that an underflow is a reasonable result? — eca2ed291a2f572f66f4a5fcf57511, Jan 31 '16 at 11:59
What do you mean by "more accurate"? If you mean that the range of representable values is larger, that may or may not be true, and if it is true, it may or may not be true in a consistent fashion. This doesn't necessarily have any bearing on whether *code written to use* such types will continue working as intended without strictfp. Put another way, having floats that are "more accurate" can actually cause the end result of a computation to be more bogus. And not because the computation is poorly-engineered. — tmyklebu, Jan 31 '16 at 12:29

score 3 · Accepted Answer · answered Jan 31 '16 at 18:01


Intel's IA64 architecture uses a fixed floating-point register format: one sign bit, 17 exponent bits, and 64 significand bits. When a floating-point number is stored from one of those registers into a 32-bit or 64-bit variable, it has to be converted.

Java aims for consistent results, so it is undesirable for the value of an expression to change depending on whether an intermediate result was held in a register or as an in-memory float or double.

Originally, Java simply insisted on all calculations being done as though all intermediate results were stored. That turned out to give poor performance due to the difficulty of forcing the exponent into the right range on each calculation. The solution was to give the programmer the choice between a fully consistent strictfp mode, and a more relaxed mode in which an exponent could go outside the range for the expression type without the value being forced to a zero or infinity.
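As a rough illustration of what "forcing the exponent into the right range" means, here is a minimal sketch. Java cannot expose the register format directly, so the sketch uses double as a stand-in for the wider exponent range; that substitution, and the names `ExponentRangeSketch`, `floatProduct` and `wideProduct`, are assumptions made for the example only. On Java 17 and later all floating-point arithmetic is strict, so the float results below are forced to zero or infinity there.

```java
// Sketch only: double stands in for a register format with a wider exponent range.
class ExponentRangeSketch {

    // Product evaluated and stored as a float. Under strict evaluation the
    // result must fit the float exponent range, so it can become 0 or Infinity.
    static float floatProduct(float a, float b) {
        return a * b;
    }

    // Same product, but carried in a format with a wider exponent range.
    static double wideProduct(float a, float b) {
        return (double) a * (double) b;
    }

    public static void main(String[] args) {
        float tiny = Float.MIN_VALUE; // 2^-149, the smallest positive float
        float huge = Float.MAX_VALUE;

        // 2^-298 is far below the float range, but well inside the double range.
        System.out.println("float  tiny*tiny = " + floatProduct(tiny, tiny));
        System.out.println("double tiny*tiny = " + wideProduct(tiny, tiny));

        // Doubling Float.MAX_VALUE overflows float, but not double.
        System.out.println("float  huge*2    = " + floatProduct(huge, 2.0f));
        System.out.println("double huge*2    = " + wideProduct(huge, 2.0f));
    }
}
```

The float versions lose the magnitude as soon as the operation completes, while the wide versions keep it until something narrower has to hold the value.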

Suppose, in relaxed mode, an in-register float has an exponent outside the double exponent range, and is being converted to an in-memory double. That conversion will force the value to a zero or infinity, losing its magnitude. That is an exception to the general rule that widening arithmetic conversions preserve the overall magnitude.

If the same calculation were done in strictfp mode, the float would not have been allowed to have an exponent outside the float exponent range. Whatever calculation generated it would have forced the value to a zero or infinity. Every float value is exactly representable in double, so the conversion does not change the value at all, let alone lose the overall magnitude.
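The claim that an in-range float survives widening can be spot-checked with a small sketch. The class and method names here are hypothetical, and a round trip over a handful of samples is only consistent with the claim, not a proof of it.

```java
// Sketch only: checks that float -> double -> float reproduces the original bits
// and that widening never turns a nonzero finite float into zero or infinity.
class WideningRoundTrip {

    static boolean survivesWidening(float f) {
        double widened = f;           // widening primitive conversion
        float back = (float) widened; // narrow back to float

        boolean sameBits = Float.floatToIntBits(back) == Float.floatToIntBits(f);
        boolean zeroPreserved = (widened == 0.0) == (f == 0.0f);
        boolean infinityPreserved = Double.isInfinite(widened) == Float.isInfinite(f);
        return sameBits && zeroPreserved && infinityPreserved;
    }

    public static void main(String[] args) {
        float[] samples = {
            0.0f, -0.0f, 1.0f, 0.1f,
            Float.MIN_VALUE,   // smallest subnormal float
            Float.MIN_NORMAL,  // smallest normal float
            Float.MAX_VALUE,
            Float.POSITIVE_INFINITY, Float.NEGATIVE_INFINITY
        };
        for (float f : samples) {
            System.out.println(f + " survives widening: " + survivesWidening(f));
        }
        // The double exponent range comfortably contains the float exponent range.
        System.out.println("float exponents:  " + Float.MIN_EXPONENT + " .. " + Float.MAX_EXPONENT);
        System.out.println("double exponents: " + Double.MIN_EXPONENT + " .. " + Double.MAX_EXPONENT);
    }
}
```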


Patricia Shanahan


I think double-extended uses 15 exponent bits. – Pascal Cuoq Feb 01 '16 at 01:57
I'm terribly sorry to keep asking you this so far. I sincerely thank you for letting me know all the details behind the specification. It was really helpful. – eca2ed291a2f572f66f4a5fcf57511 Feb 01 '16 at 15:51
Can I ask you a simple question based on your answer? When you said that "*Non-strictfp can avoid underflows to zero...*" in the comment, did you mean that **an operation on non-strictfp floats** can result in an underflow when converted to an in-memory float/double? I mean, **simply moving a non-strictfp in-memory float to a non-strictfp in-memory double** doesn't seem to be a problem. – eca2ed291a2f572f66f4a5fcf57511 Feb 01 '16 at 16:26
@Il-seobBae The only in-memory representation of double is, in effect, IEEE 754 64-bit binary, and does not have room for the wider exponent that can exist in a register. The same representation is used in memory regardless of the state of strictfp. – Patricia Shanahan Feb 01 '16 at 17:00
@PatriciaShanahan Yes. I do know that. The issue here is that the underflow you're talking about is only the product of an operation (such as addition or multiplication). In other words, `float a=something; double b=a;` shall not suffer any underflow because there is no chance of involving an operation that might need the wider exponent. In conclusion, my question is: are your comments so far entirely about a conversion that involves an operation on two or more non-strictfp floats, and therefore not applicable to the case that I mentioned above? – eca2ed291a2f572f66f4a5fcf57511 Feb 01 '16 at 17:31
@Il-seobBae It all depends on where `something` is coming from. If it were an in-memory float the exponent would be certain to fit, and there would be no underflow. If `something` has been calculated in a register, and the assignment comes from the register, it may have an over-wide exponent. – Patricia Shanahan Feb 01 '16 at 18:49
@PatriciaShanahan Thank you for answering. It helped me a lot. – eca2ed291a2f572f66f4a5fcf57511 Feb 02 '16 at 16:33
