
Our existing application reads some floating point numbers from a file. The numbers are written there by some other application (let's call it Application B). The format of this file was fixed a long time ago (and we cannot change it). In this file all the floating point numbers are saved as floats in binary representation (4 bytes each in the file).

In our program, as soon as we read the data, we convert the floats to doubles and use doubles for all calculations, because the calculations are quite extensive and we are concerned about the accumulation of rounding errors.

We noticed that when we convert floats via decimal (see the code below) we get more precise results than when we convert directly. Note: Application B also uses doubles internally and only writes them into the file as floats. Let's say Application B had the number 0.012, written to the file as a float. If we convert it after reading to decimal and then to double, we get exactly 0.012; if we convert it directly, we get 0.0120000001043081.

This can be reproduced without reading from a file - with just an assignment:

    float readFromFile = 0.012f;
    Console.WriteLine("Read from file: " + readFromFile); 
    //prints 0.012

    double forUse = readFromFile;
    Console.WriteLine("Converted to double directly: " + forUse);
    //prints 0.0120000001043081

    double forUse1 = (double)Convert.ToDecimal(readFromFile);
    Console.WriteLine("Converted to double via decimal: " + forUse1); 
    //prints 0.012

Is it always beneficial to convert from float to double via decimal, and if not, under what conditions is it beneficial?

EDIT: Application B can obtain the values which it saves in two ways:

  1. Value can be a result of calculations
  2. Value can be typed in by user as a decimal fraction (so in the example above the user had typed 0.012 into an edit box and it got converted to double, then saved to float)
farfareast
  • what happens when you don't convert at all and just write the float as is? – Sam I am says Reinstate Monica Nov 05 '12 at 22:15
  • @SamIam: Upon *writing* them? Wouldn't change much. The problem comes when you do lots of math with them. Every operation adds a little bit of rounding error, until your numbers are too far from the real thing to be useful. (Some cases take longer than others to exhibit such behavior.) Just so it's clear, this happens with doubles too; it just takes a little bit longer for the errors to get big enough to throw off calculations. It's quite easy to notice with comparisons, though. – cHao Nov 05 '12 at 22:18
  • Why not just work with `decimal`s? – Bobson Nov 05 '12 at 22:27
  • @SamIam: I added the line of code to the post to address this. – farfareast Nov 05 '12 at 22:28

3 Answers


"we get exactly 0.012"

No you don't. Neither float nor double can represent 3/250 exactly. What you do get is a value that is rendered by the string formatter Double.ToString() as "0.012". But this happens because the formatter doesn't display the exact value.

Going through decimal is causing rounding. It is likely much faster (not to mention easier to understand) to just use Math.Round with the rounding parameters you want. If what you care about is the number of significant digits, see the existing posts on rounding a double to N significant digits; one common approach is sketched below.


For what it's worth, 0.012f (which means the 32-bit IEEE-754 value nearest to 0.012) is exactly

0x3C449BA6

or

0.01200000010430812835693359375

and this is exactly representable as a System.Decimal. But Convert.ToDecimal(0.012f) won't give you that exact value -- per the documentation there is a rounding step.

The Decimal value returned by this method contains a maximum of seven significant digits. If the value parameter contains more than seven significant digits, it is rounded using rounding to nearest.

Ben Voigt
  • But why would rounding to a more "coarse" value occur when `decimal` has much better precision? – Jon Nov 05 '12 at 22:19
  • @Jon: The set of values representable by `decimal` are not a superset of those representable by `double`. Take `eps = double.Epsilon` and compare `(decimal)eps`, `(decimal)(2 * eps)` and `(decimal)(3 * eps)` and then tell me that `decimal` has much better precision. – Ben Voigt Nov 05 '12 at 22:21
  • @Jon: Here, I did it for you: http://ideone.com/TQJklg Obviously rounding can occur when converting a float to `decimal`. – Ben Voigt Nov 05 '12 at 22:28
  • Sure, but I expected something closer to the `double`. Also, consider [this](http://msdn.microsoft.com/en-us/library/yht2cx7b.aspx): *When you convert `float` or `double` to `decimal`, the source value is converted to decimal representation and rounded to the nearest number after the 28th decimal place if required.* So I think that `0.012f` in decimal representation is so close to 0.012 that 28 digits of precision aren't enough to disambiguate. I'd comfortably call that "exactly" 0.012. I don't think it's the *formatter* that causes this. – Jon Nov 05 '12 at 22:28
  • @Jon: Actually, it ought to be the nearest number representable by single-precision floating point. And that's what `(decimal)value` would do, according to the page you found. But the `Convert` class operates differently. Check out the documentation for [`Convert.ToDecimal(Single)`](http://msdn.microsoft.com/en-us/library/he38a8ca.aspx), which tells you that there is rounding applied. – Ben Voigt Nov 05 '12 at 22:30
  • Actually, the cast behaves like `Convert.ToDecimal`(!). So the one I linked to is... misleading. It does warn about loss of precision, but if that part about the 28th digit is not a red herring, I don't know what is. Link: http://ideone.com/UMwfz3 – Jon Nov 05 '12 at 22:34
  • @Jon: It's not misleading, it just doesn't apply. First line: "Explicit numeric conversion ... using a cast expression". `Convert.ToDecimal` is not a cast expression. – Ben Voigt Nov 05 '12 at 22:35
  • @BenVoigt: I agree with you. Saying "get exactly" I meant get exactly on the print. But what about the second part of the question: when is it useful to use decimal. Or you are saying it is not beneficial at all in no circumstances? – farfareast Nov 05 '12 at 22:36
  • @Jon: I'm about to check on genuine .NET, in case it's a Mono bug in casting. – Ben Voigt Nov 05 '12 at 22:36
  • Checked in .NET first. Same result. – Jon Nov 05 '12 at 22:36
  • @farfareast: It's not beneficial to use `decimal`. If you want the rounding that `Convert.ToDecimal(Single)` does, just round (with `Math.Round`). – Ben Voigt Nov 05 '12 at 22:37
  • @Jon: Well, that's interesting. Looks like a bug. Someone should check what the C# specification says. – Ben Voigt Nov 05 '12 at 22:40
  • @BenVoigt: Let's define "beneficial" as "getting a closer value to the original". If we take as the original the value 0.012 in the first line of my code, then the `double forUse1` (rendered in print as 0.012) is surely closer to original than `double forUse` (rendered in print as 0.0120000001043081). :-) – farfareast Nov 05 '12 at 22:49
  • @farfareast: How many times do I have to say: "Use `Math.Round`"? It will be faster and much clearer why you are using it. – Ben Voigt Nov 05 '12 at 22:55
  • @BenVoigt - I'd clarify that it's never beneficial to convert *through* `decimal`. Using it as your variable type in the first place, throughout your code, is not a bad thing. – Bobson Nov 05 '12 at 22:56
  • @Bobson: He can't, the question says "The format of the file is fixed and we can't change it." The data comes in as IEEE 754 single precision values. – Ben Voigt Nov 05 '12 at 22:57
  • @Ben - Ah, missed that they were stored as binary in the file. If he was reading the number in directly, he could read them in as whatever datatype he liked. – Bobson Nov 05 '12 at 22:59
  • @Jon: Well, that language comes directly from the C# Specification. Looks like a bug in the implementation of cast-to-`decimal`. – Ben Voigt Nov 05 '12 at 23:04
  • @Jon: Typing up the bug report now. – Ben Voigt Nov 05 '12 at 23:09
  • @BenVoigt: Nice :) -- could just be a doc bug though. "Implementation bug" is perhaps too hasty. – Jon Nov 05 '12 at 23:10
  • @BenVoigt: It is an interesting conversation we are having :-), but I cannot use just Math.Round because the second parameter _digits_ specifies how many digits after the dot I want to preserve, but I probably would like to preserve 5 significant digits, e.g. 1.2345e-30. Another point is that some values coming from Application B were originally typed in by (human) users through the UI. – farfareast Nov 05 '12 at 23:19
  • @Jon: No, the documentation matches the spec, word for word. The implementation does not match the spec. The spec is the definition of what is correct. – Ben Voigt Nov 05 '12 at 23:20
  • https://connect.microsoft.com/VisualStudio/feedback/details/770138/c-cast-from-system-single-to-system-decimal-violates-specification – Ben Voigt Nov 05 '12 at 23:22
  • @farfareast: Calculating the rounding position to pass to `Math.Round` is still going to be faster than calling `Convert.ToDecimal()`. Conversion to decimal calculates the rounding position, and then it does a *lot* more work as well. – Ben Voigt Nov 05 '12 at 23:25
  • @BenVoigt: With all due respect I still disagree with your answer. The post about rounding to N significant digits you reference suggests using Log and Power which complicates the code and is probably slow. Also in addition to the general case (a double converted to float and back to double) I am specifically interested in the case when the value was initially typed in by user (user types decimal representation->double->float->double). I will edit my question to stress that. – farfareast Nov 06 '12 at 01:53
  • @farfareast: The original representation has been lost. And even more information was lost when the data was stored in the file. **You cannot recover it without additional information.** `Log` and `Power` may be slow, but you just don't seem to get it. **The `ToDecimal` call also has to calculate the round-off.** Therefore the fastest rounding algorithm is certainly faster than calling `ToDecimal`. Now, calling `Log` may very well not be the fastest method. A binary search for the correct scaling factor is probably better. – Ben Voigt Nov 06 '12 at 14:07
  • @BenVoigt: Well, there is some **additional information** that is that certain numbers were typed in by users (as decimals). – farfareast Nov 07 '12 at 01:20
  • @farfareast: That's not useful information. What would be useful information is knowing that the original program only provided a blank x digits long for certain data. You haven't said. – Ben Voigt Nov 07 '12 at 03:11

As strange as it may seem, conversion via decimal (with Convert.ToDecimal(float)) may be beneficial in some circumstances.

It will improve the precision if it is known that the original numbers were provided by users in decimal representation and users typed no more than 7 significant digits.

To prove it I wrote a small program (see below). Here is the explanation:

As you recall from the OP this is the sequence of steps:

  1. Application B has doubles coming from two sources: (a) results of calculations; (b) converted from user-typed decimal numbers.
  2. Application B writes its doubles as floats into the file - effectively doing binary rounding from 52 binary digits (IEEE 754 double) to 23 binary digits (IEEE 754 single).
  3. Our Application reads that float and converts it to a double by one of two ways:

    (a) direct assignment to double - effectively padding a 23-bit number to a 52-bit number with binary zeros (29 zero-bits);

    (b) via conversion to decimal with (double)Convert.ToDecimal(float).

As Ben Voigt correctly noticed, Convert.ToDecimal(float) (see the Remarks section on MSDN) rounds the result to 7 significant decimal digits. Wikipedia's article on IEEE 754 single precision says the precision is 24 bits - equivalent to log10(pow(2,24)) ≈ 7.225 decimal digits. So, when we do the conversion to decimal, we lose that 0.225 of a decimal digit.

So, in the generic case, when there is no additional information about the doubles, the conversion to decimal will in most cases make us lose some precision.

But (!) if there is the additional knowledge that originally (before being written to the file as floats) the doubles were decimals with no more than 7 significant digits, then the rounding errors introduced in the decimal rounding (step 3(b) above) compensate for the rounding errors introduced by the binary rounding (in step 2 above).

In the program, to prove the statement for the generic case, I randomly generate doubles, cast each one to float, then convert it back to double (a) directly and (b) via decimal, and measure the distance between the original double and double (a) and double (b). If double (a) is closer to the original than double (b), I increment the pro-direct-conversion counter; in the opposite case I increment the pro-via-decimal counter. I do this in a loop of 1 million iterations, then print the ratio of the pro-direct counter to the pro-via-decimal counter. The ratio turns out to be about 3.7, i.e. in approximately 4 cases out of 5 the conversion via decimal spoils the number.

To prove the case when the numbers are typed in by users, I used the same program with the only change that I apply Math.Round(originalDouble, N) to the doubles. Because I get the original doubles from the Random class, they are all between 0 and 1, so the number of significant digits coincides with the number of digits after the decimal point. I ran this method in a loop for N from 1 to 15 significant digits typed by the user, and plotted the result on a graph: the dependency of how many times the direct conversion is better than the conversion via decimal on the number of significant digits typed by the user.

As you can see, for 1 to 7 typed digits the conversion via decimal is always better than the direct conversion. To be exact, for a million random numbers only 1 or 2 are not improved by the conversion via decimal.

Here is the code used for the comparison:

    private static void CompareWhichIsBetter(int numTypedDigits)
    {
        Console.WriteLine("Number of typed digits: " + numTypedDigits);
        Random rnd = new Random(DateTime.Now.Millisecond);
        int countDecimalIsBetter = 0;
        int countDirectIsBetter = 0;
        int countEqual = 0;

        for (int i = 0; i < 1000000; i++)
        {
            double origDouble = rnd.NextDouble();
            //Use the line below for the user-typed-in-numbers case.
            //double origDouble = Math.Round(rnd.NextDouble(), numTypedDigits);

            float x = (float)origDouble;
            double viaFloatAndDecimal = (double)Convert.ToDecimal(x);
            double viaFloat = x;

            double diff1 = Math.Abs(origDouble - viaFloatAndDecimal);
            double diff2 = Math.Abs(origDouble - viaFloat);

            if (diff1 < diff2)
                countDecimalIsBetter++;
            else if (diff1 > diff2)
                countDirectIsBetter++;
            else
                countEqual++;
        }

        Console.WriteLine("Decimal better: " + countDecimalIsBetter);
        Console.WriteLine("Direct better: " + countDirectIsBetter);
        Console.WriteLine("Equal: " + countEqual);
        Console.WriteLine("Betterness of direct conversion: " + (double)countDirectIsBetter / countDecimalIsBetter);
        Console.WriteLine("Betterness of conv. via decimal: " + (double)countDecimalIsBetter / countDirectIsBetter);
        Console.WriteLine();
    }
farfareast
  • I didn't suggest using direct conversion instead of `(double)Convert.ToDecimal(input)`. I suggested `Math.Round`. Which will be as good as `Convert.ToDecimal` in every case, but faster (if you use a sensible algorithm for computing the rounding place). – Ben Voigt Nov 07 '12 at 03:12

Here's a different answer - I'm not sure that it's any better than Ben's (almost certainly not), but it should produce the right results:

    float readFromFile = 0.012f;
    decimal forUse = Convert.ToDecimal(readFromFile.ToString("0.000"));

So long as .ToString("0.000") produces the "correct" number (which should be easy to spot-check), then you'll get something you can work with and not have to worry about rounding errors. If you need more precision, just add more 0's.

Of course, if you actually need to work with 0.012f out to the maximum precision, then this won't help, but if that's the case, then you don't want to be converting it from a float in the first place.

Bobson
  • Oops. Thanks for catching that, @farfareast. – Bobson Nov 06 '12 at 18:48
  • Working with decimals internally instead of doubles (if you mean this) is not an option because of several reasons: 1. (most important) in my tests arithmetic operations with decimals are about 15 times slower than with doubles and our app is calculation intensive; 2. Many standard C# math methods (like Math.Sqrt for example) do not have overloads taking decimals as parameters, so we will have to convert decimals to doubles and lose precision in many cases. – farfareast Nov 07 '12 at 00:24
  • @farfareast - Yes, that was what I intended. Those are both quite good reasons, however, so this is clearly not an option. I'm not at all surprised that it's slower, because you're effectively trading speed for precision by using them. However, I'm **really** surprised there's no decimal form of `Math.Sqrt`. – Bobson Nov 07 '12 at 05:03
  • +1 It is a sensible option to reduce rounding errors. It is just not good for my circumstances, but it can be used by somebody else who is not so performance sensitive. I would also change the format string from "0.000" to "G4" to demand 4 (or another number of) significant digits - no matter where the decimal point is. – farfareast Nov 07 '12 at 14:03
  • @farfareast - Out of curiosity, I just found an algorithm for a `Sqrt` for a `decimal` and did a performance test. On a decimal, it takes 4000ish ticks to find the square root out to the maximum precision of a decimal, and 13ish ticks to find the root out to the maximum of a double (which is around half as far). So it's a **very** clear precision-vs-speed tradeoff, but you don't need to worry about the lack of [`Sqrt`](http://stackoverflow.com/a/13282997/298754). – Bobson Nov 08 '12 at 05:33
  • @Bobson: I wonder how that `Decimal` square-root routine was implemented? I would expect that it should be possible to scale the value up to a big integer (probably represented using three `UInt64` variables to represent overlapping pieces), use `Math.Sqrt` to compute an approximation, use "manually-implemented" multi-precision integer maths to refine that approximation to an accurate integer square root, and then convert the result to a properly-scaled `Decimal`. It still wouldn't be nearly as fast as `Math.Sqrt` on a `double`, but I'd expect it to be a lot less than 300x slower. – supercat Feb 04 '13 at 17:37
  • @supercat - Assuming that this method is the one I preserved in my utility library (which I think it is, but I'm not positive), then it was a one-function recursive version of [Newton's method](http://en.wikipedia.org/wiki/Newton%27s_method), applied until precision limits hid any further changes. I don't have the benchmark code any more, though. – Bobson Feb 04 '13 at 18:42