
I ran into a curious issue. Look at this simple code:

#include <iostream>

// snprintf_l and _LIBCPP_GET_C_LOCALE are non-standard (libc++ / BSD locale) extensions
int main(int argc, char **argv) {
    char buf[1000];
    snprintf_l(buf, sizeof(buf), _LIBCPP_GET_C_LOCALE, "%.17f", 0.123e30f);
    std::cout << "WTF?: " << buf << std::endl;
}

The output looks quite weird:

123000004117574256822262431744.00000000000000000

My question is: how is it implemented? Can someone show me the original code? I did not find it, or maybe it is too complicated for me.

I've tried to reimplement the same double-to-string conversion in Java but failed. Even when I extracted the exponent and fraction parts separately and summed the fraction terms in a loop, I always got zeros instead of the digits "...822262431744". When I tried to keep summing fraction terms beyond the 23 bits of a float, I ran into another problem: how many terms do I need to collect? Why does the original code stop where it does and not continue until the requested scale is exhausted? So, I really do not understand the basic logic of how it is implemented. I also tried defining really big numbers (e.g. 0.123e127f), and it generates a huge number in decimal format. The number has far more digits than a float's precision allows. This looks like a bug to me, because the string representation seems to contain information the float cannot hold.
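For comparison, the exact binary value stored in the float can be inspected with Java's `new BigDecimal(double)` constructor, which (unlike `BigDecimal.valueOf`, which rounds through `Double.toString`) is documented to preserve the exact binary value; widening the `float` to `double` is lossless. A sketch:

```java
import java.math.BigDecimal;

public class ExactFloatValue {
    public static void main(String[] args) {
        // new BigDecimal(double) constructs the exact binary value of its
        // argument with no decimal rounding; the float -> double widening
        // conversion is lossless, so this is the exact value of the float.
        BigDecimal exact = new BigDecimal((double) 0.123e30f);
        System.out.println(exact.toPlainString());
        // prints 123000004117574256822262431744
    }
}
```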

Valera Dubrava
  • Looks fine to me. `0.123e30f` is a 30-digit number, but `float` only has about 7 to 9 digits of precision, so the rest are "invented". The format string `"%.17f"` specifies non-scientific notation with 17 digits after the decimal point. – Richard Critten Jul 07 '22 at 13:01
  • Java implementation gives me this number ```123000004117574260000000000000.00000000000000000```. – Valera Dubrava Jul 07 '22 at 13:08
  • So, I wanted to say that the tail of digits (822262431744) looks like something that does not exist in the real float. – Valera Dubrava Jul 07 '22 at 13:09
  • 1
    Anything after about here `123000004` does not exist in a `float`. All you can expect from a `float` is between 7 and 9 digits. Have a read of [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) – Richard Critten Jul 07 '22 at 13:11
  • For me it does not seem right. As I mentioned before, I split the float into exponent and fraction bits and built the decimal number in Java. It was exactly what Java's formatting gives me, the same number. And those zeros (before the dot) are what you get when you pass the 23 fraction bits and take the sum by 2^30. – Valera Dubrava Jul 07 '22 at 13:21
  • I see that you are trying to explain to me how floating-point numbers are made. But I think the point lies elsewhere. The question is: how does it transform float to string with such precision? Are those digits fractions 1/2^(23 + ...)? – Valera Dubrava Jul 07 '22 at 13:29
  • 3
    The IEEE 754 `float` nearest to 0.123e30 *is* 123000004117574256822262431744. – molbdnilo Jul 07 '22 at 13:30
  • But actually the 0.123e30 represents everything halfway between prev(0.123e30) and 0.123e30 and halfway between 0.123e30 and next(0.123e30). Java might just be better at picking a nicer decimal number within that range than C. In C the float will be passed as double on common CPUs with variadic args, so the C double-to-string function might not have a choice about adding more digits. The double has more precision even if that is purely invented by the conversion from float. – Goswin von Brederlow Jul 07 '22 at 13:37
  • 1
    It is perhaps worth noting that the Java result is not representable as an IEEE 754 float either, which seems broken to me. – molbdnilo Jul 07 '22 at 14:05
  • @molbdnilo how did you get this value? So, my point is: how do you build such a number from a float value? – Valera Dubrava Jul 07 '22 at 14:24
  • @ValeraDubrava read up on how floating point numbers are stored in IEEE 754 format. Then have a read of [Exploring Binary on Floating Point](https://www.exploringbinary.com/tag/floating-point/) – Richard Critten Jul 07 '22 at 14:41
  • @RichardCritten, sorry, but I do not get it - how does this thread help me? – Valera Dubrava Oct 23 '22 at 14:03
  • @ValeraDubrava there have been many bugs in the language support libraries (C, Java etc.) when converting floating-point numbers to strings and in the other direction. The linked blog documents some of them and other related issues. These issues can explain why there are differences between the behaviour of different languages when representing floating point as strings. – Richard Critten Oct 23 '22 at 14:09

3 Answers

0

Please read the documentation:

printf, fprintf, sprintf, snprintf, printf_s, fprintf_s, sprintf_s, snprintf_s - cppreference.com

The format string consists of ordinary multibyte characters (except %), which are copied unchanged into the output stream, and conversion specifications. Each conversion specification has the following format:

  • introductory % character
  • ...
  • (optional) . followed by integer number or *, or neither that specifies precision of the conversion. In the case when * is used, the precision is specified by an additional argument of type int, which appears before the argument to be converted, but after the argument supplying minimum field width if one is supplied. If the value of this argument is negative, it is ignored. If neither a number nor * is used, the precision is taken as zero. See the table below for exact effects of precision.

....

Conversion specifier: `f`, `F`
Explanation: converts floating-point number to the decimal notation in the style [-]ddd.ddd. Precision specifies the exact number of digits to appear after the decimal point character. The default precision is 6. In the alternative implementation the decimal point character is written even if no digits follow it. For infinity and not-a-number conversion style see notes.
Expected argument type: `double`

So with `f` you forced the form ddd.ddd (no exponent), and with `.17` you forced it to show 17 digits after the decimal separator. With such a big value the printed outcome looks that odd.
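A Java sketch of the same format semantics (note the caveat: Java's `Formatter` rounds to the double's shortest decimal form before padding, so the digits beyond the significant ones come out as zeros rather than C's exact expansion):

```java
import java.util.Locale;

public class FixedNotationDemo {
    public static void main(String[] args) {
        // %.17f forces fixed (non-scientific) notation with exactly 17
        // digits after the decimal point, just as in C's printf family.
        // Since the stored value is a huge integer, those 17 digits are
        // all zeros.
        String s = String.format(Locale.ROOT, "%.17f", (double) 0.123e30f);
        System.out.println(s);
    }
}
```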

Marek R
0

Finally I've found out the difference between Java's float -> decimal -> string conversion and C++'s float -> string (decimal) conversion. I did not find the original source code, but I replicated the same logic in Java to make it clear. I think the code explains everything:

    import java.math.BigDecimal;
    import java.math.MathContext;
    import java.math.RoundingMode;

    // the context size can be calculated properly from the maximum decimal
    // length of a float (including the exponent range) - it is 40 + scale,
    // 17 in my case
    static String floatToExactString(float value) {
        MathContext context = new MathContext(57, RoundingMode.HALF_UP);
        BigDecimal divisor = BigDecimal.valueOf(2);
        int tmp = Float.floatToRawIntBits(value);
        boolean sign = tmp < 0;
        tmp <<= 1;
        // there might be a NaN value; this code does not support it
        int exponent = (tmp >>> 24) - 127;
        tmp <<= 8;
        int mask = 1 << 23;
        int fraction = mask | (tmp >>> 9);
        // at this point we have all parts of the float: sign, exponent and
        // fraction bits; now let's build the mantissa
        BigDecimal mantissa = BigDecimal.ZERO;
        for (int i = 0; i < 24; i++) {
            if ((fraction & mask) == mask) {
                // I'm not sure about speed; division on each iteration might
                // be faster than pow
                mantissa = mantissa.add(divisor.pow(-i, context));
            }
            mask >>>= 1;
        }

        // this was the core line where I was losing accuracy, because of the context
        BigDecimal decimal = mantissa.multiply(divisor.pow(exponent, context), context);
        String str = decimal.setScale(17, RoundingMode.HALF_UP).toPlainString();
        // add the minus sign manually: Java drops it if the value becomes 0
        // after scaling, while the C++ version of the code keeps it
        if (sign) {
            str = "-" + str;
        }
        return str;
    }

Maybe this topic is useless; who really needs the very same implementation that C++ has? But at least this code keeps all the precision of the float number, compared to the most popular way of converting a float to a decimal string:

    return BigDecimal.valueOf(1.23e30f).setScale(17, RoundingMode.HALF_UP).toPlainString();
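For what it's worth, the bit-level reconstruction above can also be skipped entirely: `new BigDecimal(double)` (as opposed to `BigDecimal.valueOf`, which rounds through `Double.toString`) is documented to keep the exact binary value, and the `float` -> `double` widening is lossless. A sketch using the question's value `0.123e30f`:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ExactConversion {
    public static void main(String[] args) {
        // The float -> double widening is exact, and new BigDecimal(double)
        // preserves the exact binary value, so no digits are lost here.
        String s = new BigDecimal((double) 0.123e30f)
                .setScale(17, RoundingMode.HALF_UP)
                .toPlainString();
        System.out.println(s);
        // prints 123000004117574256822262431744.00000000000000000
    }
}
```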
Valera Dubrava
-1

The C++ implementation you are using uses the IEEE-754 binary32 format for float. In this format, the closest representable value to 0.123•10^30 is 123,000,004,117,574,256,822,262,431,744, which is represented in the binary32 format as +13,023,132•2^73. So 0.123e30f in the source code yields the number 123,000,004,117,574,256,822,262,431,744. (Because the number is represented as +13,023,132•2^73, we know its value is exactly that, which is 123,000,004,117,574,256,822,262,431,744, even though the digits “123000004117574256822262431744” are not stored directly.)

Then, when you format it with %.17f, your C++ implementation prints the exact value faithfully, yielding “123000004117574256822262431744.00000000000000000”. This accuracy is not required by the C++ standard, and some C++ implementations will not do the conversion exactly.

The Java specification also does not require formatting of floating-point values to be exact, at least in some formatting operations. (I am going from memory and some supposition here; I do not have a citation at hand.) It allows, perhaps even requires, that only a certain number of correct digits be produced, after which zeros are used if needed for positioning relative to the decimal point or for the requested format.

The number has much higher precision than float can be.

For any value represented in the float format, that value has infinite precision. The number +13,023,132•2^73 is exactly +13,023,132•2^73, which is exactly 123,000,004,117,574,256,822,262,431,744, to infinite precision. The precision the format has for representing numbers affects only which numbers it can represent, not how precisely it represents the numbers that it does represent.
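The claim about the stored representation can be checked mechanically; a sketch in Java comparing the exact value of the float against +13,023,132•2^73:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class SignificandCheck {
    public static void main(String[] args) {
        // Exact value of the float (float -> double widening is lossless,
        // and new BigDecimal(double) keeps the exact binary value).
        BigDecimal fromFloat = new BigDecimal((double) 0.123e30f);
        // The representation claimed above: +13,023,132 * 2^73.
        BigDecimal claimed = new BigDecimal(
                BigInteger.valueOf(13023132).shiftLeft(73));
        System.out.println(fromFloat.compareTo(claimed) == 0);
        // prints true
    }
}
```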

Eric Postpischil
  • `binary32` representation of 123,000,004,117,574,256,822,262,431,744 is 1.552478313446045×2⁹⁶ and that corresponds with reality https://godbolt.org/z/39q8Eo1MM Your claim that the same number is represented in `binary32` as +13,023,132×2⁷³ has no basis in reality. The standards merely don't prohibit other formats. – Maxim Egorushkin Jul 09 '22 at 03:28
  • Here is an up-to-date overview of IEEE 754 formats from Nvidia https://docs.nvidia.com/cuda/floating-point/index.html#formats – Maxim Egorushkin Jul 09 '22 at 03:50
  • IEEE 754-2019 clause 3.3 says we may treat the significand as an integer: “to view the significand as an integer… floating-point numbers of the form (-1)^s×b^q×c, where… c is a number represented by a digit string of the form d_0 d_1 d_2… d_(p-1) where d_i is an integer digit 0≤d_i – Eric Postpischil Jul 09 '22 at 09:25
  • You are confusing _view_ with _binary representation_. – Maxim Egorushkin Jul 09 '22 at 18:25
  • No, I am not. See the full text of the cited passage. – Eric Postpischil Jul 09 '22 at 18:35