3

Let's say I want to write a function that does the following:

Given a number N,
when N is rounded to D1 digits
does the result include more than D2 decimal places, not counting trailing zeroes?

For example, say N is .01001, D1 is 4, and D2 is 2. The question becomes, does .0100 include more than 2 decimal places, not counting trailing zeroes? And the answer is "no". But if N was .00101, the answer would be "yes".

I'm looking for an efficient way to do this using standard C library functions, taking into account the limitations of floating-point numbers.

An example of my intended usage: show a number using four digits if necessary, but otherwise show it using two digits.

(This is not a homework question -- this is a result of being a professional programmer who didn't do these kinds of homework questions when he was a student.)

JW.
  • 50,691
  • 36
  • 115
  • 143
  • This is actually an interesting and difficult problem. If it were homework, I wonder what kind of answer an instructor would be looking for. The only correct answer I could envision CS students being able to come up with is the one I gave. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 00:46

5 Answers5

4

The only easy way to do this with the standard library is to use snprintf (or just sprintf) with the right format, and then count the zeros yourself. Correctly computing the decimal representation of a (binary) floating point number is a very very difficult task that you don't want to try to do yourself, and you have near-zero chance of writing a correct version that's faster than your standard library one.

I hope I got this right; it's untested:

double n; /* the number N */
int d1, d2; /* the parameters d1 and d2 */
char s[MAXLEN], *z;
snprintf(s, sizeof s, "%.*f", d1, n);
for (z=s+strlen(s)-1; *z==0; z--);
if (strlen(++z)<d1-d2) puts("yes");
else puts("no");

Edit: As noted by ssianky, snprintf may have limitations on the precision it prints. Actually the C standard allows pretty much any floating point operation to give the wrong result for no reason whatsoever as long as the implementation documents as such, but IEEE behavior is encouraged, and POSIX additionally requires correctly rounded results up to DECIMAL_DIG places, but allows implementations to print nonsense (for example all zeros) after printing sufficiently many places to uniquely determine the actual floating point value. So to make a long story short, if d1 is reasonably large, or if your platform is pathological, snprintf-based approaches might not give the right answer. (In practice they'll give the right answer on GNU systems and the wrong answer on Microsoft systems.)

If you care about this shortcoming and want correct results for any value of d1, you'll have to implement exact float-to-decimal code yourself. Loops that involve multiplying a floating point value repeatedly by 10 will not suffice for this.

Edit 2: Looking at the OP's "intended usage", using snprintf seems like a no-brainer. If you want to print the value to begin with and are just trying to decide how many decimal places to use, just print it to a string and then chop off the trailing zeros before displaying it. In fact, the %g printf format specifier might even do what the OP wants already...

Community
  • 1
  • 1
R.. GitHub STOP HELPING ICE
  • 208,859
  • 35
  • 376
  • 711
  • Thanks -- I've enjoyed reading the answers and discussion. I was expecting the solution to be a mathematical one, but I think you're right that chopping off the zeros is the best approach. – JW. Sep 26 '10 at 02:13
1

You can first multiply your number by 10^D1 and round it to the nearest integer, then check if it's divisible by 10^D2. There are a handful of rounding functions to choose from that will round up/down/away from zero/etc., so be sure to check that you're using the one you want.

The function below is written so that it is small and self-contained, but a production implementation should use a lookup table for the powers of ten as indicated in the comment lines. This will yield faster performance and avoid error accumulating in the floating-point product.

int f (double N, unsigned int D1, unsigned int D2)
{
   int i, n_mult_round, ten_d2;
   /* instead of the loop below, a real implementation should use */
   /* N *= dbl_powers_of_ten[D1]; */
   /* where dbl_powers_of_ten is a double array containing powers of ten */
   for (i = 0; i < D1; ++i) {
      N *= 10.0;
   }
   n_mult_round = (int) round (N);
   /* instead of the loop below, a real implementation should use */
   /* ten_d2 = int_powers_of_ten[D2]; */
   /* where int_powers_of_ten is an int array containing powers of ten */
   ten_d2 = 1;
   for (i = 0; i < D2; ++i) {
      ten_d2 *= 10;
   }
   if ( n_mult_round % ten_d2 == 0 ) {
      return ( 1 );
   }
   else {
      return ( 0 );
   }
}
Philip Starhill
  • 205
  • 1
  • 5
  • Multiplication by powers of 10 is a lossy operation for floating point numbers - multiplication by `10^n` destroys `2*n` bits of precision. This could very well cause incorrect results. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 00:43
  • 2
    @R.. Are you sure? That implies that multiplying a 32-bit float by 10^16 gives complete garbage. – Philip Starhill Sep 22 '10 at 00:57
  • @Philip: Try `(1.0+DBL_EPSILON)*10.0` - make sure the result gets stored to memory and read back so you don't get excess precision, though. You should get `10.0`. Then try multiplying some numbers like `1.0000000123` by large powers of 10 and see what happens. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 01:04
  • @R.. He's not disputing whether multiplication causes a loss of precision. However, your claim as to the 'rate' of loss is ludicrous! Multiplying by a power of 10 primarily affects the exponent. The only reason there is any loss at all is because of the imprecise conversion between base-10, and base-2 for the mantissa. – Disillusioned Sep 22 '10 at 05:20
  • +1 Given the intent of the question (to decide on 2 vs 4 decimal representation), this solution should be perfectly acceptable. – Disillusioned Sep 22 '10 at 05:24
  • @Craig: I did mis-state the rate of loss, but it's probably supposed to be `((mant_bits-2)/mant_bits)^n`. or similar. No idea what you're talking about regarding conversion of base for the mantissa. No base conversion happens when you multiply; multiplying by 10 is (conceptually) shifting the mantissa left 2 bits, adding that to the original mantissa (this addition is where bits of precision are lost), and incrementing the exponent. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 05:27
  • However, cumulative rounding error could be reduced by first determining the power of 10, and only multiplying by N once. – Disillusioned Sep 22 '10 at 05:31
  • Well if the power of 10 is too big, it will suffer from rounding itself, before you even go multiplying it by `N`. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 05:42
  • Thanks for the comments. I added some comments to the function and modified the description to point out the problem of error accumulation in repeated multiplication by 10.0. – Philip Starhill Sep 22 '10 at 17:30
0
  1. Format it as a string to the max precision.
  2. Remove zeros from the right until you reach the min precision.

Here's an unsafe implementation to get you started:

char buff[20]; /* make as big as necessary */
int maxPrecision = 4;
int minPrecision = 2;

sprintf(buff, "%.*f", maxPrecision, myFloat);

char *p = buff + (strlen(buff) - 1);
char *punct = strchr(buff, ".");

/* remove trailing zeros until we reach minPrecision */
while (*p == '0' && p > (punct + minPrecision))
    *p-- = 0; 
egrunin
  • 24,650
  • 8
  • 50
  • 93
0

What about this? fmod should return the decimals after the specified number of decimal places, so we can compare those two numbers to see if they're equal.

int needsD1Decimals(float N, int D1, int D2) {
    float pow1 = pow(0.1f, D1); // 0.0001
    float pow2 = pow(0.1f, D2); // 0.01
    float mod1 = fmod(N, pow1); // 0.00001
    float mod2 = fmod(N, pow2); // 0.00001
    if (fabs(mod1 - mod2) > (pow1 / 2)) { // > 0.00005 to handle errors
        return 1;
    }
    return 0;
}

If you're just wanting to print out the correct answer, it might be faster to just print it with the most decimal places and truncate the zeros at the end:

void printNumber(float N, int D1, int D2) {
    char format[256];
    char result[256];
    sprintf(format, "%%.%df", D1);
    sprintf(result, format, N);
    char *end = result + strlen(result) - 1;
    int zeros = 0;
    while (end > result && end[0] == '0' && zeros < (D1 - D2))
    {
        zeros++;
        end--;
    }
    if (zeros >= (D1 - D2))
    {
        end[1] = '\0';
    }
    puts(result);
}

void doNumber(float N, int D1, int D2) {
    printf("%f, %d, %d: ", N, D1, D2);
    printNumber(N, D1, D2);
    printf("\n");
}

int _tmain(int argc, _TCHAR* argv[])
{
    doNumber(0.01001f, 4, 2);
    doNumber(0.010101f, 4, 2);
    doNumber(0.011001f, 4, 2);
    doNumber(5000.0f, 4, 2);
    return 0;
}
Jason Goemaat
  • 28,692
  • 15
  • 86
  • 113
-1
int foo(float number, int d1, int d2)
{
    assert((number > 0.0f) && (number < 1.0f) && (d1 <= FLT_DIG) && (d2 < d1) && (d2 > 0));

    int count = d1;
    int n = (int)(number * pow(10.0f, d1));

    if (n == 0) return 0;

    while ((n % 10) == 0)
    {
        n /= 10;
        count--;
    }

    return (count > d2) ? 1 : 0;
}
ssianky
  • 97
  • 7
  • -1 for hopelessly wrong answer and ignorance of floating point – R.. GitHub STOP HELPING ICE Sep 22 '10 at 00:28
  • And another -1 if I could give it for writing code that's not even valid C: `int(number)`...??? – R.. GitHub STOP HELPING ICE Sep 22 '10 at 00:45
  • What output does your program give? (Once you fix the syntax errors...) It gives 25 on my machine. Hint: you've completely ignored the parameter `d1` in the statement of the problem. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 00:49
  • 1) I fink you are who are totally ignorant about floating numbers. Multiplying it with 10 will not change fraction part, but only exponent. And yes, int(n) is not valid C cast, but (int)n is. – ssianky Sep 22 '10 at 00:52
  • @R.. : `int(number)` is valid C++. It has the same semantics as C's `(int) number` cast. But briefly slipping into the wrong language is the least of this answer's problems. – kenm Sep 22 '10 at 00:54
  • Floating point numbers are binary, not decimal, and have binary exponents. Multiplying by 10 shifts the mantissa 2 bits left, adds that to the original mantissa, and increments the exponent by 1. Have you even tried running your code? – R.. GitHub STOP HELPING ICE Sep 22 '10 at 00:55
  • Multiplying by 10 does't shifts nothing. It just adds 1 to mantissa. Problem is in "number -= int(number);", and I know why. – ssianky Sep 22 '10 at 01:03
  • @ssianky: What you're missing is that there's no such floating point number as 0.00101. The initial value of `number` in your code is 0.0010100000000000000470457006684910084004513919353485107421875. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 01:10
  • Well, first try is incorrect, but the KEY is efficiency. So, using the main idea from that try, I did a more efficient and I think correct algorithm... – ssianky Sep 22 '10 at 02:18
  • It's still not correct. It will definitely give the wrong answer for numbers of the form `1.0+epsilon` where `epsilon` is small; see my comments to Philip's answer. I'm quite sure you will not come up with a correct (for all inputs) solution which outperforms `sprintf` because converting floating point numbers to decimal is a highly nontrivial problem. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 03:49
  • d1 <= FLT_DIG assertion guaranties correct solution. sptrinf will have same problem for a number of digits bigger than FLT_DIG/DBL_DGT – ssianky Sep 22 '10 at 03:58
  • try d1 = 40, n = 0.00101f in your code and look at value of s – ssianky Sep 22 '10 at 04:05
  • The POSIX condition for when `sprintf` can fail to print the exact value is if `d1>DECIMAL_DIG`... In plain C, pretty much any floating point operation can give the wrong result for any reason the implementor likes. Your code should fail much sooner. If you want correct results for any value of `d1`, you basically need to re-implement `sprintf`'s floating point printing, but with the added stipulation that it works even for large precisions greater than `DECIMAL_DIG`. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 04:26
  • By the way, the GNU libc version of `printf` **does** work for any value of `d1`. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 04:28
  • Try your code with `n=1e-30` and `d1=39`. This is well within the range where any reasonable `printf` should give the right answer (9 significant places) but your code gives the wrong answer (it computes `count` as 39 rather than 30). Why? Because `pow(10.0,39)` is inexact and suffers from loss of precision. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 04:53
  • My code will fail at the same point where sprintf will do. Difference is just in precision - I used float32, but sptrintf uses float64. Is not hard to extend my solution to doubles too. Even to float128, if compiler supports it. – ssianky Sep 22 '10 at 04:54
  • Nope. `printf` fails either not-at-all (good implementations like glibc) or after a certain number of **significant** digits (on most other systems). Your code fails after a fixed number of places after the decimal point, possibly before reaching **any** significant digits. That's a big failure. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 04:57
  • You CAN'T use d1=39 in my code, because of assertion (d1 <= FLT_DIG). FLT_DIG == 6 – ssianky Sep 22 '10 at 04:57
  • sprintf will fail at DBL_DIG == 15 instead. Like I said, it's just a better precision. – ssianky Sep 22 '10 at 05:00
  • As I have said many times, the `sprintf` implementation works for **any** value of `d1` that fits in your string buffer, **if** you have a good underlying `printf` like the GNU one. Even if not, it never fails at a fixed number of places after the decimal point, only after a number of significant figures, and `n=1e-30` with `d1=39` will work just fine. So will `n=1e-1000` with `d1=1009`. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 05:09
  • Oh, and I believe you're right that your code works as long as the `assert` conditions are met (though I haven't checked it in detail), but the strictness of the conditions limits the usefulness a great deal... – R.. GitHub STOP HELPING ICE Sep 22 '10 at 05:12
  • And I repeat, your code for "double n = 0.00101f; int d1 = 20; int d2 = 17;" say "yes", but should say "no" – ssianky Sep 22 '10 at 05:15
  • It says yes **just like it should**. `n` to 20 places after the decimal **is** `0.00101000000000000000003537210` and as you can see that's 19 places (the final 20'th place happens to be a 0, as a result of rounding). – R.. GitHub STOP HELPING ICE Sep 22 '10 at 05:20
  • It says yes because it is WRONG. Doubles have only 15 decimal digits of precision, the ending "3537210" is just the trash that sprintf added. – ssianky Sep 22 '10 at 05:27
  • It's not trash. It's the value of the number. Doubles do not "have" decimal digits; they have binary digits, and those binary digits have an exact decimal expression which is much longer than the number of decimal digits needed to accurately serialize and deserialize the double as a decimal string. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 05:30
  • If you don't believe me, read a hexdump of the value `0.00101` as a double, then get out a good arbitrary-precision calculator and add up the powers of 2 from the binary. After rounding it to 20 places after the decimal point, you'll get exactly what I wrote above. It is **not junk**. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 05:32
  • It IS junk. "With the 52 bits of the fraction significand appearing in the memory format, the total precision is therefore 53 bits (approximately 16 decimal digits, \log_{10}(2^{53}) \approx 15.955)" – ssianky Sep 22 '10 at 05:36
  • Read what I said. I never said you can "encode" any number of decimal digits in a floating point representation. I said floating point numbers have exact decimal representations, representations which happen to take significantly more than the value of `DBL_DIG` digits. Your idea that a floating point number "encodes" a decimal value is backwards. It encodes a sum of powers of 2 whose exponents are in a fixed absolute range and fixed ranges relative to one another. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 05:47
  • And you read what I said. sprintf fills digits starting from 16-st with junk, but not with zeros how it should be. – ssianky Sep 22 '10 at 05:56
  • I suspect it just multiplies with 10, how I did first time. – ssianky Sep 22 '10 at 05:59
  • Rather than "suspecting", why don't you do some computations? Here are some to try: `1.0+DBL_EPSILON`. Printing that with `DBL_DIG` (15) places shows `1.000000000000000`, so obviously `DBL_DIG` is not sufficient precision. Now compute `1 + 2^-52` with an arbitrary-precision calculator and see what you get. If your `printf` is good, you'll get the same result using `%.52f`. By the way, if you try that "multiply by 10" thing, you'll quickly see that it gives the wrong results. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 06:06
  • Ok, you said above "0.00101f == 0.00101000000000000000003537210". My compiler says that "0.00101f == "0.00101000000722706320". So, it is undefined behavior. Your code is compiler specific and WRONG. – ssianky Sep 22 '10 at 06:17
  • I can suppose, that your compiler works internally with more precise real numbers. – ssianky Sep 22 '10 at 06:21
  • I already told you C makes basically no guarantees about floating point accuracy. This does not mean it has UB; in this case the behavior is implementation-defined, and good implementations will define it as always giving the right answer...which you still should go and work out. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 06:22
  • Actually the value I gave you was for `0.00101`, and your result is for `0.00101f`. Apart from being truncated to just 20 places, both are accurate. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 06:25
  • Ok, if no guaranty, than assertions should be done. I did, and my code workы for given limits and no more. Your code gives different results on different compilers. – ssianky Sep 22 '10 at 06:26
  • Any floating point code can give different results on different compilers. One valid implementation of floating point is for all floating point expressions to evaluate to 0.0. :-) – R.. GitHub STOP HELPING ICE Sep 22 '10 at 06:27
  • for 0.00101, my compiler works wright and shows "0.001010000000000000000000000000000000000000000000000000000000". No trash. – ssianky Sep 22 '10 at 06:27
  • Those zeros are trash. They differ from the actual value of the expression. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 06:29
  • No, zeros are correct. Compiler or sprintf implementation just excluded inaccurate digits, as I did in my solution – ssianky Sep 22 '10 at 06:31
  • Try `printf("%a\n", 0.00101);` and see what it shows. This is guaranteed to be exact. Then try to reconcile that with the nonsensical zeros you got printing it in decimal (hint: sum the powers of 2 by hand if you need to). Also do the rest of the computations I suggested and you just **might** learn something about floating point. – R.. GitHub STOP HELPING ICE Sep 22 '10 at 06:34
  • Hint, You can see those bits just by see memory on &n address in the debugger. I worked with floating point and fixed point numbers and know enough about their binary representation. – ssianky Sep 22 '10 at 06:43