Truncate a floating-point number to the leading N decimal digits

Question

Which is the most optimal way to get the n leftmost non-zero digits of a floating point number (number >= 0.0).

For example,

if n = 1:

0.014568 -> 0.01
0.246456 -> 0.2

if n = 2:

0.014568 -> 0.014
0.246456 -> 0.24

After @schil227 comment: Currently I am doing multiplications and divisions (by 10) as necessary in order to have the n digits at the decimal number field.

"Optimal" is already a superlative. "Most optimal" doesn't make sense. It's like "C++++". — Kerrek SB, Nov 15 '16 at 20:22
Your question indicates the leftmost N digits, assuming no rounding, correct? With rounding, it would become getting N+1 digits, then removing the last one and rounding the Nth one. Then, the question depends on how the number is represented. I would love to answer getting the N leftmost non-zero bits of the binary representation, for example... For C++, the assumption would be IEEE-754 format — Bruno Guardia, Nov 15 '16 at 20:24
@n.m. I need that because I am clustering values according to that — Angelos, Nov 15 '16 at 20:28
Try this: `printf ("%.2e", 0.0014321);` If this is not exactly what you need, you may get some ideas looking at the source code of `printf`. — n. m. could be an AI, Nov 15 '16 at 20:31
Throwing a basic idea out there: multiply the float by 10 until you've found a non-zero digit in the ones place (say x times), at which point you know you need x+n decimal points, which you could then truncate off the float (careful not to go out of bounds). — schil227, Nov 15 '16 at 20:43
You are asking for the number (not rounded) to `n` significant figures? Optimal code? Or ***any*** code, since there is none shown? I suggest you try a few different solutions yourself. If you have working code, ask on Stackreview, since your results seem to be correct. — Weather Vane, Nov 15 '16 at 20:54
I suppose that instead of multiplication you could check if your number is greater than other decimals; e.g. if(f > 0.1){ return 0;}else if (f > 0.01){ return 1;} ... not sure how much more efficient it would be, it's more or less traversing down the decimal. — schil227, Nov 15 '16 at 20:57
@WeatherVane I have tried some solutions my self, I just didn't showed any code here because I am mostly looking for an optimal algorithm or idea on that topic. I have tried solutions similar to snprintf like propose earlier and I am writing what I am currently doing. If you think is nessesary to see some code let me know. — Angelos, Nov 15 '16 at 20:59
I meant [Code review](http://codereview.stackexchange.com/) sorry. On this site it is always necessary to show your code. — Weather Vane, Nov 15 '16 at 21:01
@schil227 A challenge to "multiply the float by 10" is that the product can easily incur rounding and contaminate the desired result. If OP wants a high quality answer, this is a non-trivial code request to "roll-your-own". — chux - Reinstate Monica, Nov 15 '16 at 22:13
"optimal way" and "correct" may be in conflict here. Very accurate answers are hard as code will effectively repeat a lot of `sprintf()` code. Faster, less accurate code is easy or over a limit range. I assume code needs to be accurate/correct first over all FP range, speedy second. So the only real way to get _faster_ code is to specify what limitations you can live with: range of `number`, range of `n`, precision of result. So far the only limit posted is `number >= 0.0` and IEEE 754-2008, yet did not even specify _binary_ FP. — chux - Reinstate Monica, Nov 15 '16 at 22:23
" Currently I am doing multiplications and divisions (by 10) as necessary in order to have the n digits at the decimal number field." --> posting that code would help. — chux - Reinstate Monica, Nov 15 '16 at 22:24
Does this work? Use `pow(10, digits + nearbyint(log10(x)))` as a scale factor, then use `nearbyint()` to round the double to the nearest integer, then reverse the scaling. I'm not sure that would even be faster than printing to a string and truncating, since log() is not that cheap, and pow() isn't great either (especially since it takes a `double` exponent, not integer). Printing to a string is even more likely to be good if you actually want a string result and can avoid going back to `double`. (And if you don't, are you *sure* you want to throw away precision?) — Peter Cordes, Nov 17 '16 at 06:53

score 4 · Answer 1 · edited May 23 '17 at 10:30

4

Code could use sprintf(buf, "%e",...) to do most of the heavy lifting.

There are so many corner cases that other direct code may fail, sprintf() is likely to be, at least, as good solid reference solution.

This code prints the double to DBL_DECIMAL_DIG places to insure there is no rounding in digits that would make a difference. Then it zeros out various digits depending on n.

See @Mark Dickinson comment for reasons to use a greater value than DBL_DECIMAL_DIG. Perhaps on the order of DBL_DECIMAL_DIG*2. As mentioned above, there are many corner cases.

#include <float.h>
#include <math.h>
#include <stdio.h>

double foo(double x, int n) {
  if (!isfinite(x)) {
    return x;
  }
  printf("%g\n", x);
  char buf[DBL_DECIMAL_DIG + 11];
  sprintf(buf, "%+.*e", DBL_DECIMAL_DIG, x);
  //puts(buf);
  assert(n >= 1 && n <= DBL_DECIMAL_DIG + 1);
  memset(buf + 2 + n, '0', DBL_DECIMAL_DIG - n + 1);
  //puts(buf);
  char *endptr;
  x = strtod(buf, &endptr);
  printf("%g\n", x);
  return x;
}

int main() {
 foo(0.014568, 1);
 foo(0.246456, 1);
 foo(0.014568, 2);
 foo(0.246456, 2);
 return 0;
}

Output

This answer assumes OP does not want a rounded answer. Re: 0.246456 -> 0.24

edited May 23 '17 at 10:30

Community

1
1

answered Nov 15 '16 at 22:01

chux - Reinstate Monica

143,097
13
135
256

2

So even this won't cover *all* the corner cases, right? E.g., if `x = 0x1.1079848152a7ap-1` (exact value in decimal `0.5321771056999999860437355891917832195758819580078125`), then the first ten digits after the point are `0.5321771056`, but formatting to 15 places after the point and then truncating will give `0.5321771057`. – Mark Dickinson Nov 16 '16 at 09:02
@Mark Dickinson In your example, why did you "format to 15" which incurs its own rounding issue. This code does not use `15` to form the result. Better to use at at least `DBL_DECIMAL_DIG` total significant digits. Unable to replicate your findings. Please report the return value using `"%a"`. When I ran `foo(0x1.1079848152a7ap-1, 10)`, the return value was `0x1.1079848076c0ap-1` or `0.5321_7710_5599_9999_7777..`. or `0.5321_7710_56` when formated to 10 significant digits. Hmmm perhaps your code used `DBL_DIG` and not `DBL_DECIMAL_DIG`? – chux - Reinstate Monica Nov 16 '16 at 14:49
Ah, sorry, yes. But the problem exists with `DBL_DECIMAL_DIG`, too, just not with that particular example. The `sprintf` call rounds, and that that round can end up rounding a string of 9s up. I'll look for examples shortly. – Mark Dickinson Nov 16 '16 at 16:40
Updated example: with `x = 0x1.e529dcaae1d0ap-1` (decimal value `0.9475850065799999999427427610498853027820587158203125`), and assuming a `DBL_DECIMAL_DIG` of 17, which is fairly typical for IEEE 754 systems, the `sprintf` result is `"'9.47585006580000000e-01'"`, so a request for the first 11 digits after the point would get `0.94758500658` instead of the correct `0.94758500657`. I'm not sure whether the OP really cares about such corner cases, though. – Mark Dickinson Nov 16 '16 at 16:58
@MarkDickinson Agree, the precision of `DBL_DECIMAL_DIG` is insufficient in your [comment's case](http://stackoverflow.com/questions/40618967/floating-numbers/40620511?noredirect=1#comment68508010_40620511). This issue then becomes what is the maximum number of `9`s that could exist in a decimal-ization of a binary number. I suspect it is `DBL_DIG`, so the `sprintf("%.*e")` method may need `DBL_DECIMAL_DIG + DBL_DIG` significant digits. Yet once code attempts more than `DBL_DECIMAL_DIG` digits, the quality of `sprintf()` is suspect. C has no specs pass `DBL_DECIMAL_DIG`. Your thoughts? – chux - Reinstate Monica Nov 16 '16 at 17:20
Aha, I know the answer to that one! The maximum number of internal 9s, for IEEE 754 binary64, is 20: http://stackoverflow.com/a/20302431/270986. Agreed about `sprintf` being questionable beyond that point anyway, though on a good number of systems it should be using something like David Gay's correctly-rounded code. – Mark Dickinson Nov 16 '16 at 17:31
@MarkDickinson Nice. Yet for this task, the max 9s also depends on if it make a difference in the `double` answer unlike `0x1.c23142c9da581p-405` So code only need consider max 9s _directly_ after `n` digits `n` is likely somewhere in `DBL_DIG <= n <= DBL_DECIMAL_DIG` range and may vary some depending on `x`. BTW: Thanks for the good technical feedback. – chux - Reinstate Monica Nov 16 '16 at 17:39
Yes, you're right. That's the answer to the wrong question. :-( It's an interesting problem, that unfortunately I don't have the cycles to look at right now. – Mark Dickinson Nov 16 '16 at 18:22
Hmm, it might be more efficient to pass DBL_DECIMAL_DIG as part of the format string directly (with a stringify macro). But OTOH, then sprintf would have to atoi() it instead of having the integer value as an arg. Pretty small difference either way, I'm sure, compared to all the `double` division-by-10 that has to happen. – Peter Cordes Nov 17 '16 at 06:35
@PeterCordes Interesting thought. OTOH an optimizing compiler can analyze the format with `sprintf(buf, "%+.*e", DBL_DECIMAL_DIG, x);` and generate `internal_double_to_estr(buf, x, DBL_DECIMAL_DIG, '+')` or something equivalent that does not need to repeatedly re-parse the format string. This removes various inefficiencies. I doubt an advanced FP to string function does lots of `/10`. Good FP to decimal string to a well traveled and hard problem. Many details and corner case, hence I promote not re-doing that code lightly. – chux - Reinstate Monica Nov 17 '16 at 15:04
I've looked at [strtod (glibc vs. David Gay's)](http://www.exploringbinary.com/how-glibc-strtod-works/), but not the other direction. If not literally `/10`, probably still not cheap. Still, if you can copy an existing open-source implementation and tweak it to truncate to N digits instead of producing a much longer correctly-rounded result and truncating that, you could get a major speedup by converting only the digits you need. – Peter Cordes Nov 17 '16 at 19:22
Or if you don't need a string result at all, maybe just scale with `pow(10, digits + floor(log10(fabs(x))))`, `floor` that, then scale back. – Peter Cordes Nov 17 '16 at 19:23
@PeterCordes Suggested scaling works in general, yet has corner case issues when `x` is a power-of-10 at at sub-normal values. Suggest posting that as answer, suspect OP may prefer simplicity vs. accuracy. Doubt speed is the issue. – chux - Reinstate Monica Nov 17 '16 at 19:34
@chux: Since you didn't see any obvious problem with that idea either, I went and implemented it. It works about as well as you could hope for, I think (at least with glibc's highly-accurate libm functions). Answer posted. – Peter Cordes Nov 17 '16 at 21:35

score 3 · Answer 2 · edited May 23 '17 at 11:55

If you want the result as a string, you should probably print to a string with extra precision, then chop that off yourself. (See @chux's answer for details on how much extra precision you need for IEEE 64-bit double to avoid rounding up from a string of 9s, since you want truncation but all the usual to-string functions round to nearest.)

If you want a double result, then are you sure you really want this? Rounding / truncating early in the middle of a calculation usually just worsens the accuracy of the final result. Of course, there are uses in real algorithms for floor/ceil, trunc, and nearbyint, and this is just a scaled version of trunc.

If you just want a double, you can get fairly good results without ever going to a string. Use ndigits and floor(log10(fabs(x))) to work out a scale factor, then truncate the scaled value to an integer, then scale back.

Tested and working (with and without -ffast-math). See the asm on the Godbolt compiler explorer. This might run reasonably efficiently, especially with -ffast-math -msse4.1 (so floor and trunc can inline to roundsd).

If you care about speed, look into replacing pow() with something that takes advantage of the fact that the exponent is a small integer. I'm not sure how fast library pow() implementations are in that case. GNU C __builtin_powi(x, n) trades accuracy for speed, for integer exponents, doing a multiplication tree, which is less accurate than what pow() does.

#include <float.h>
#include <math.h>
#include <stdio.h>

double truncate_n_digits(double x, int digits)
{
    if (x==0 || !isfinite(x))
        return x;   // good idea stolen from Chux's answer :)

    double l10 = log10(fabs(x));
    double scale = pow(10.,  floor(l10) + (1 - digits));  // floor rounds towards -Inf
    double scaled = x / scale;
    double scaletrunc = trunc(scaled);  // trunc rounds towards zero
    double truncated = scaletrunc * scale;

#if 1    // debugging code
    printf("%2d %24.14g =>\t%24.14g\t scale=%g, scaled=%.30g\n", digits, x, truncated, scale, scaled);
    // print with more accuracy to reveal the real behaviour
    printf("   %24.20g =>\t%24.20g\n", x, truncated);
#endif

    return truncated;
}

test cases:

int main() {
 truncate_n_digits(0.014568, 1);
 truncate_n_digits(0.246456, 1);
 truncate_n_digits(0.014568, 2);
 truncate_n_digits(-0.246456, 2);
 truncate_n_digits(1234567, 2);
 truncate_n_digits(99999999999, 6);
 truncate_n_digits(-99999999999, 6);
 truncate_n_digits(99999, 10);
 truncate_n_digits(-0.0000000001234567, 3);
 truncate_n_digits(1000, 6);
 truncate_n_digits(0.001, 6);
 truncate_n_digits(1e-312, 2);  // denormal, and not exactly representable: 9.999...e-313
 truncate_n_digits(nextafter(1e-312, INFINITY), 2);  // denormal, just above 1.00000e-312
 return 0;
}

each result shown twice: first with only %.14g so rounding gives the string we want, then again with %.20g to show enough places to reveal the realities of floating point math. Most numbers are not exactly-representable, so even with perfect rounding it's impossible to return a double exactly represents the truncated decimal string. (Integers up to about the size of the mantissa are exactly representable, and so are fractions where the denominator is a power of 2.)

 1                 0.014568 =>                      0.01         scale=0.01, scaled=1.45679999999999987281285029894
    0.014567999999999999353 =>   0.010000000000000000208
 1                 0.246456 =>                       0.2         scale=0.1, scaled=2.46456000000000008398615136684
      0.2464560000000000084 =>     0.2000000000000000111
 2                 0.014568 =>                     0.014         scale=0.001, scaled=14.5679999999999996163069226895
    0.014567999999999999353 =>   0.014000000000000000291
 2                -0.246456 =>                     -0.24         scale=0.01, scaled=-24.6456000000000017280399333686
     -0.2464560000000000084 =>   -0.23999999999999999112
 3               1234.56789 =>                      1230         scale=10, scaled=123.456789000000000555701262783
       1234.567890000000034 =>                      1230
 6               1234.56789 =>                   1234.56         scale=0.01, scaled=123456.789000000004307366907597
       1234.567890000000034 =>     1234.5599999999999454
 6              99999999999 =>               99999900000         scale=100000, scaled=999999.999990000040270388126373
                99999999999 =>               99999900000
 6             -99999999999 =>              -99999900000         scale=100000, scaled=-999999.999990000040270388126373
               -99999999999 =>              -99999900000
10                    99999 =>                     99999         scale=1e-05, scaled=9999900000
                      99999 =>     99999.000000000014552
 3            -1.234567e-10 =>                 -1.23e-10         scale=1e-12, scaled=-123.456699999999983674570103176
   -1.234566999999999879e-10 => -1.2299999999999998884e-10
 6                     1000 =>                      1000         scale=0.01, scaled=100000
                       1000 =>                      1000
 6                    0.001 =>                     0.001         scale=1e-08, scaled=100000
   0.0010000000000000000208 =>  0.0010000000000000000208
 2     9.9999999999847e-313 =>      9.9999999996388e-313         scale=1e-314, scaled=100.000000003458453079474566039
   9.9999999999846534143e-313 =>        9.9999999996388074622e-313
 2     1.0000000000034e-312 =>      9.0000000001196e-313         scale=1e-313, scaled=9.9999999999011865980946822674
   1.0000000000034059979e-312 =>        9.0000000001195857973e-31

Since the result you want will often not be exactly representable, (and because of other rounding errors) the resulting double will sometimes be below the result you want, so printing it with full precision might give 1.19999999 instead of 1.20000011. You might want to use nextafter(result, copysign(INFINITY, original)) to get a result that's more likely to have a higher magnitude than what you want.

Of course, that could just make things worse in some cases. But since we truncate towards zero, most often we get a result that's just below (in magnitude) the unrepresentable exact value.

Nice, maybe someday I'll run a random `double` generator against the two methods to find differences and strengths/weaknesses in the solutions. What is really nifty is having > 1 Independent ways to check each other. BTW: suggest add zero test `if (x == 0 || !isfinite(x)) return x;` — chux - Reinstate Monica, Nov 17 '16 at 22:02
Update: perhaps `pow(0.1, ...)` could let us swap `x / scale` to `x * scale`. But `0.1` is not exactly representable as a binary float/double; `10.` is, so that could make things even worse. — Peter Cordes, Jun 12 '22 at 19:15

Chameleon · Answer 3 · 2022-06-15T19:05:01.113

Ok, another one like @Peter Cordes but more generic.

/** Return \c digits semantic digis of number \c x.
    \tparam T Type of number \c x can be floating point or integral.
    \param x The number.
    \param digits The requested number of semantic digits of number \c x.
    \return The number with only \c digits semantic digits of number \c x. */
template<typename T>
requires(std::integral<T> || std::floating_point<T>)
T roundn(T x, unsigned int digits)
{
    if (!x || !std::isfinite(x)) return x;
    typedef std::conditional_t<std::floating_point<T>, T, double> Tp;
    Tp mul = pow(10, floor(digits - log10(abs(x))));
    Tp y = round(x * mul) / mul;
    if constexpr (std::floating_point<T>) return y;
    else return round(y);
}

int main()
{
    cout << setprecision(100);
    cout << roundn(123.456789, 1) << "\n";
    cout << roundn(123.456789, 2) << "\n";
    cout << roundn(123.456789, 3) << "\n";
    cout << roundn(123.456789, 4) << "\n";
    cout << roundn(123.456789, 5) << "\n";
    cout << roundn(-123.456789, 1) << "\n";
    cout << roundn(-123.456789, 2) << "\n";
    cout << roundn(-123.456789, 3) << "\n";
    cout << roundn(-123.456789, 4) << "\n";
    cout << roundn(-123.456789, 5) << "\n";
    cout << roundn(-123.456789, 15) << "\n";
    cout << roundn(123456, 1) << "\n";
    cout << roundn(123456, 2) << "\n";
    cout << roundn(123456, 3) << "\n";
    cout << roundn(123456, 10) << "\n";
    cout << roundn(-123456, 1) << "\n";
    cout << roundn(-123456, 2) << "\n";
    cout << roundn(-123456, 3) << "\n";
    cout << roundn(-123456, 10) << "\n";
    cout << roundn(0.0123456789, 1) << "\n";
    cout << roundn(0.0123456789, 2) << "\n";
    cout << roundn(-0.0123456789, 1) << "\n";
    cout << roundn(-0.0123456789, 2) << "\n";
    return 0;
}

It returns

99.9999999999999857891452847979962825775146484375
120
123
123.5
123.4599999999999937472239253111183643341064453125
-99.9999999999999857891452847979962825775146484375
-120
-123
-123.5
-123.4599999999999937472239253111183643341064453125
-123.4567890000000005557012627832591533660888671875
100000
120000
123000
123456
-100000
-120000
-123000
-123456
0.01000000000000000020816681711721685132943093776702880859375
0.0120000000000000002498001805406602215953171253204345703125
-0.01000000000000000020816681711721685132943093776702880859375
-0.0120000000000000002498001805406602215953171253204345703125

Truncate a floating-point number to the leading N decimal digits

3 Answers3