Determine smallest floating point type that can hold a string value

Question

I'm working on a method that translates a string into an appropriate Number type, depending upon the format of the number. If the number appears to be a floating point value, then I need to return the smallest type I can use without sacrificing precision (Float, Double or BigDecimal).

Based on "How many significant digits have floats and doubles in java?" (and other resources), I've learned that Float values have 23 bits for the mantissa. Based on this, I used the following method to return the bit length for a given value:

private static int getBitLengthOfSignificand(String integerPart,
    String fractionalPart) {
  return new BigInteger(integerPart + fractionalPart).bitLength();
}

If the result of this test is below 24, I return a Float. If below 53 I return a Double, otherwise a BigDecimal.

However, I'm confused by the result when I consider Float.MAX_VALUE, which is 3.4028235E38. The bit length of the significand is 26 according to my method (where integerPart = 3 and fractionalPart = 4028235). This triggers my method to return a Double, when clearly Float would suffice.
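The mismatch can be reproduced with a few lines. This is only a sketch: the method is the one shown above, and the class name `BitLengthDemo` is made up for illustration.

```java
import java.math.BigInteger;

final class BitLengthDemo {
    // Method from the question: bit length of the decimal digits read as one integer.
    static int getBitLengthOfSignificand(String integerPart, String fractionalPart) {
        return new BigInteger(integerPart + fractionalPart).bitLength();
    }

    public static void main(String[] args) {
        // 34028235 lies between 2^25 and 2^26, so it needs 26 bits.
        System.out.println(getBitLengthOfSignificand("3", "4028235")); // 26
        // Yet the same string parses to exactly the largest finite float.
        System.out.println(Float.parseFloat("3.4028235E38") == Float.MAX_VALUE); // true
    }
}
```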

Can someone highlight the flaw in my thinking or implementation? Another idea I had was to convert the string to a BigDecimal and scale down using floatValue() and doubleValue(), testing for overflow (which is represented by infinite values). But that loses precision, so isn't appropriate for me.

possible duplicate of [Is floating point math broken?](http://stackoverflow.com/questions/588004/is-floating-point-math-broken) — tmyklebu, Sep 14 '14 at 21:02
@tmyklebu Can you explain how that is a duplicate? I've asked a very specific question about type conversion and you appear to have linked to a generic question about inaccuracies in floating point numbers. How are the two related? — Duncan Jones, Sep 15 '14 at 06:36
You cannot do this without "sacrificing precision." Both in the case of `3.4028235e38 = 0xFFFFFF2A6D7FC1BEF94AD08780000000` and in various decimal cases, the way you figure out what the precision loss is is by computing a binary expansion, making this fundamentally the same question as the other one. — tmyklebu, Sep 15 '14 at 12:00

Pascal Cuoq · Answer 1 · 2014-09-14T09:49:10.327

The significand is stored in binary, and you can think of it as a number in its decimal representation only if you don't let it confuse you.

The exponent is a binary exponent: it represents a multiplication by a power of two, not by a power of ten. For this reason, the E38 in the number you used as an example is only a convenience: the real significand is in binary and should be multiplied by a power of two to obtain the actual number. Powers of two and powers of ten aren't the same, so “3.4028235” is not the real significand. The real significand of Float.MAX_VALUE is, in hexadecimal notation, 0x1.fffffe, and its associated exponent is 127, meaning that Float.MAX_VALUE is actually 0x1.fffffe * 2¹²⁷.
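The binary significand and exponent can be inspected directly with standard library calls. A small sketch (the class name `HexSignificandDemo` is made up for illustration):

```java
final class HexSignificandDemo {
    public static void main(String[] args) {
        // Hexadecimal significand followed by the power-of-two exponent.
        System.out.println(Float.toHexString(Float.MAX_VALUE)); // 0x1.fffffep127
        // The unbiased binary exponent on its own.
        System.out.println(Math.getExponent(Float.MAX_VALUE));  // 127
        // Rebuild the value as significand * 2^127.
        float rebuilt = Math.scalb(0x1.fffffep0f, 127);
        System.out.println(rebuilt == Float.MAX_VALUE);         // true
    }
}
```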

Looking at the decimal representation to choose a binary floating-point type to put the value in, as you are trying to do, doesn't work. For one thing, the number of decimal digits that one is sure to recover from a float is different from the number of decimal digits one may need to write to distinguish a float from its neighbors (6 and 9 respectively). You chose to write “3.4028235E38”, but you could have written 3.40282E38, which to your algorithm looks easier to represent when it really isn't. When people write that “3.4028235E38” is the largest finite value of the float type, they mean that if you round this decimal number to float, you will arrive at the largest float. If you parse “3.4028235E38” as a double-precision number, it won't even be equal to Float.MAX_VALUE.

To put it differently: another way to write Float.MAX_VALUE is 3.4028234663852885981170418348451692544E38. It is still representable as a float (it represents the exact same value as 3.4028235E38). It looks like it has many digits because these are decimal digits that appear for a decimal exponent, when in fact the number is represented internally with a binary exponent.
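Both claims can be checked with `BigDecimal`, whose `BigDecimal(double)` constructor converts a binary floating-point value to decimal exactly. A minimal sketch (the class name `ExactDecimalDemo` is made up for illustration):

```java
import java.math.BigDecimal;

final class ExactDecimalDemo {
    public static void main(String[] args) {
        // Exact decimal expansion of the largest finite float (the float widens to double losslessly).
        BigDecimal exact = new BigDecimal(Float.MAX_VALUE);
        BigDecimal longForm = new BigDecimal("3.4028234663852885981170418348451692544E38");
        System.out.println(exact.compareTo(longForm) == 0); // true

        // Parsed as a double, the short decimal string lands on a different value.
        System.out.println(Double.parseDouble("3.4028235E38") == Float.MAX_VALUE); // false
    }
}
```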

(By the way, your approach does not check that the exponent is in range to represent a number in the chosen type, which is another condition for a type to be able to represent the number from a string.)

Thank you for a detailed description of the underlying issues. Do you have any suggestions for a better approach to solving my problem? – Duncan Jones Sep 14 '14 at 11:55
@Duncan It seems to me that you are trying to translate the intention of the author of the string into a choice of format. If the author of the string wrote “0.1”, then perhaps he really only cares about this digit, and any value between 0.096 and 0.14 is acceptable to him. Then `float` is a good type to use. But perhaps the author meant “0.100000000000” and the trailing zeroes were omitted out of convention. In this case `float` is not appropriate but `double` still works. – Pascal Cuoq Sep 14 '14 at 12:01
@Duncan If this is what you are trying to do, then just do not test your function on 3.4028235E38, which is a special case: people who write this string mean exactly a certain `float` (the largest finite `float`), and this is why they write as many as 8 decimals. But they write the decimals of a representable `float`, on purpose, because the number they are translating to decimal is a `float`. It is normal for this number, with the number of decimals it has, without entering the details of its actual value, to look like it would need a `double` type to be contained accurately. – Pascal Cuoq Sep 14 '14 at 12:06
@Duncan 3.4028235E38 only **happens** to be close to one particular `float`, but another decimal representation with the same number of decimals would not have been. – Pascal Cuoq Sep 14 '14 at 12:06
Given that the input is a string, I suppose my goal is to ensure that this type of test would always pass: `assertEquals(inputString, myMethod(inputString).toString());` Is that even possible? If it is, do you have any practical advice on how to select the correct type to convert to? A naive approach might be to try successively larger types until a `toString()` test passes. – Duncan Jones Sep 15 '14 at 07:05
@Duncan If the goal is to have `assertEquals(inputString, myMethod(inputString).toString())`, the simplest way is to make that the test: try `float` first, then `double`, and use `BigDecimal` if neither binary floating-point type satisfies the property. If you can afford the two round-trip conversions (microseconds on a modern computer), this is the simplest. – Pascal Cuoq Sep 15 '14 at 07:48
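The round-trip test suggested in the comment above could look roughly like this. It is a sketch, not a definitive implementation: the class name `RoundTripParser` and the method `parse` are hypothetical, and the comparison is purely textual.

```java
import java.math.BigDecimal;

final class RoundTripParser {
    // Hypothetical helper: returns the smallest type whose toString() reproduces the input.
    static Number parse(String input) {
        Float f = Float.valueOf(input);
        if (f.toString().equals(input)) {
            return f;
        }
        Double d = Double.valueOf(input);
        if (d.toString().equals(input)) {
            return d;
        }
        // Neither binary type prints back the same text; fall back to decimal.
        return new BigDecimal(input);
    }

    public static void main(String[] args) {
        System.out.println(parse("3.4028235E38").getClass().getSimpleName());           // Float
        System.out.println(parse("3.141592653589793").getClass().getSimpleName());      // Double
        System.out.println(parse("3.14159265358979323846").getClass().getSimpleName()); // BigDecimal
    }
}
```

Because the check compares strings, inputs written in a form that `toString()` never produces (for example "1.50" or "1e3") fall through to `BigDecimal` even though a `float` holds their value exactly.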

Patricia Shanahan · Answer 2 · 2014-09-14T15:13:24.977

I would work in terms of the difference between the actual value and the nearest float. BigDecimal can store any finite length decimal fraction exactly and do arithmetic on it:

Convert the String to the nearest float x. If x is infinite but the value has a finite double representation, use double.

Convert the String exactly to BigDecimal y.

If y is zero, use float, which can represent zero exactly.

If not, convert the float x to BigDecimal, z.

Calculate, in BigDecimal to a reasonable number of decimal places, the absolute value of (y-z)/z. That is the relative rounding error due to using float. If it is small enough for your purposes, less than some value you pick, use float. If not, use double.
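The steps above can be sketched as follows. The class name `RelativeErrorChooser`, the method `choose`, and the `tolerance` parameter are hypothetical. The handling of a nonzero input that underflows to a zero float, and the fallback to `BigDecimal` when even `double` overflows or underflows, are assumptions that the steps do not spell out.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class RelativeErrorChooser {
    // Hypothetical helper: picks float when its relative rounding error is below tolerance.
    static Number choose(String input, BigDecimal tolerance) {
        float x = Float.parseFloat(input);     // nearest float
        BigDecimal y = new BigDecimal(input);  // exact decimal value
        if (y.signum() == 0) {
            return x;                          // zero is exact in float
        }
        double d = Double.parseDouble(input);
        if (Float.isInfinite(x) || x == 0.0f) {
            // Assumption: treat overflow and underflow alike and move to a wider type.
            if (Double.isInfinite(d) || d == 0.0) {
                return y;
            }
            return d;
        }
        BigDecimal z = new BigDecimal(x);      // exact value of the float
        // Relative rounding error |(y - z) / z|, rounded to 16 significant digits.
        BigDecimal relativeError = y.subtract(z).divide(z, MathContext.DECIMAL64).abs();
        if (relativeError.compareTo(tolerance) < 0) {
            return x;
        }
        return d;
    }

    public static void main(String[] args) {
        System.out.println(choose("0.1", new BigDecimal("1e-6")).getClass().getSimpleName());  // Float
        System.out.println(choose("0.1", new BigDecimal("1e-12")).getClass().getSimpleName()); // Double
    }
}
```

The explicit `if` statements are deliberate: a conditional expression mixing `float` and `double` operands would promote both to `double` and always return a `Double`.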

If you literally want no sacrifice in precision, it is much simpler. Convert to both float and double. Compare them for equality. The comparison will be done in double. If they compare equal, go with the float. If not, go with the double.

Regarding your final paragraph - I presume some input values will need BigDecimal to be represented accurately. How would you propose to differentiate between those three cases? — Duncan Jones, Sep 15 '14 at 07:23
Again, check for equality. If converting to either `float` or `double` then to `BigDecimal` gives the same numerical value as converting directly to `BigDecimal`, the float or double was exact. — Patricia Shanahan, Sep 15 '14 at 08:00
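A sketch of this exact-equality check, covering all three result types. The class name `ExactTypeChooser` and the method `smallestExact` are hypothetical. `BigDecimal.compareTo` is used instead of `equals` so that differences in scale (2.0 versus 2.00) are ignored.

```java
import java.math.BigDecimal;

final class ExactTypeChooser {
    // Hypothetical helper: returns the smallest type that holds the decimal value exactly.
    static Number smallestExact(String input) {
        BigDecimal exact = new BigDecimal(input);

        float f = Float.parseFloat(input);
        // new BigDecimal(...) rejects infinities, so check for overflow first.
        if (!Float.isInfinite(f) && new BigDecimal(f).compareTo(exact) == 0) {
            return f;
        }
        double d = Double.parseDouble(input);
        if (!Double.isInfinite(d) && new BigDecimal(d).compareTo(exact) == 0) {
            return d;
        }
        return exact;
    }

    public static void main(String[] args) {
        System.out.println(smallestExact("0.5").getClass().getSimpleName());      // Float
        System.out.println(smallestExact("16777217").getClass().getSimpleName()); // Double
        System.out.println(smallestExact("0.1").getClass().getSimpleName());      // BigDecimal
    }
}
```

Under this strict criterion most decimal fractions, including "0.1" and the question's "3.4028235E38", end up as `BigDecimal`, because neither binary type holds them exactly.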
