I'm trying to understand how floating-point operations give different output. I expected the test below to result in 1; however, VB.NET and C# give one output, whereas Java gives a different one. It probably has something to do with the compiler. I read What Every Computer Scientist Should Know About Floating-Point Arithmetic, but it's confusing. Can someone explain in simple language?

VB.NET

Dim x As Single = 1.000001
Dim y As Single = 0.000001

Dim result = x - y

Output: 0.9999999. The same goes for C#.

Also, while watching the variable in Visual Studio, the value of result is different from what is printed. The printed value is trimmed so that only seven 9s are shown (which is understood), but I don't understand how the actual result is 0.99999994.

Update: Alright, I'm more specifically interested in how this calculation is done (I removed the Java stuff).

Polynomial Proton
  • The simple answer is that floating point is only an **approximation** of the value. If you want to get predictable results, use the Decimal data type. – Blackwood Jun 02 '15 at 21:40
  • *The tests performed below result to 1* All results give something different than 1. What exactly do you mean? If you are interested whether the numbers are equal, check the binary representation of the results. For Java you can get the binary representation via [Long.toBinaryString(Double.doubleToRawLongBits(d))](http://stackoverflow.com/questions/6359847/convert-double-to-binary-representation) – Turing85 Jun 02 '15 at 21:40
  • @Silvermind I updated my question to be more specific, I'm not comparing float and double – Polynomial Proton Jun 02 '15 at 21:44
  • @Blackwood Yes, Decimal works fine. I'm interested in learning how the compiler performs these operations to get the output for floating-point operations – Polynomial Proton Jun 02 '15 at 21:45
  • @TheUknown: I suggest just regarding these differences as random discrepancies caused by the approximate nature of floating point numbers. Ignore the discrepancies, or avoid them by using appropriate data types when you need a precise result. – Blackwood Jun 02 '15 at 22:00
  • @Blackwood That's fine by me. I'm not stuck on using float/single. I'm just curious about the calculations. Thanks for your replies though :) – Polynomial Proton Jun 02 '15 at 22:01
  • Using this [site](http://www.h-schmidt.net/FloatConverter/IEEE754.html) you can see the binary representation and the approximation of numbers in double precision. For `1.000001` it gives `1.0000009536743164`; for `0.000001` it gives `0.0000009999999974752427` (9.999999974752427e-7), and the subtraction gives `0.9999999536743189247573`, which, when pasted into the site above, is rounded to `0.99999994` (probably because of single precision) – Sehnsucht Jun 02 '15 at 22:13
  • Step 1 is to read [this article](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html). It is fantastic at explaining how floating point numbers work. – Icemanind Jun 02 '15 at 22:31
  • Basically, numbers of the form 1/10, 1/100, 1/1000 and so on are not exactly representable in binary. This has nothing to do with float/double: no matter the precision, you can't give a precise value to these numbers in binary. Just like 1/3 cannot be represented exactly in the decimal system because it is 0.333... with endless 3s. – Mystra007 Jun 16 '15 at 02:03

1 Answer

A Visual Basic Single (or C# float) is stored in the IEEE 754 single-precision format. The number occupies 32 bits: the first bit is the sign, the next 8 bits store the exponent, and the remaining 23 bits store the fraction.
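As a rough illustration in Python (not VB.NET; the helper name `float32_fields` is made up for this sketch), the three fields can be pulled out of the 32-bit pattern with the standard `struct` module. Note that the exponent field is stored with a bias of 127, so a stored value of 127 means 2^0.

```python
import struct

def float32_fields(value):
    """Split the float32 encoding of value into (sign, biased exponent, fraction).

    Hypothetical helper for illustration: struct.pack(">f", ...) rounds the
    Python float (a double) to the nearest 32-bit float, big-endian.
    """
    bits = int.from_bytes(struct.pack(">f", value), "big")
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 bits
    return sign, exponent, fraction

for v in (0.25, 1.000001, 0.000001):
    sign, exponent, fraction = float32_fields(v)
    # Print the fraction as 23 binary digits and the exponent without its bias.
    print(v, sign, exponent - 127, format(fraction, "023b"))
```

The unbiased exponents this prints are -2, 0 and -20 for the three inputs.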

First the integer part of the number is converted to base 2. Then the fraction is converted to base 2, and the number is shifted by the appropriate exponent to match this format:

1.x1..x23 x 2^e

where x1 to x23 are the bits in the fraction part and e is the exponent.

For example 0.25 is converted to: 1.0 x 2^-2
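The conversion to this normalized form can be sketched in Python with exact rational arithmetic. This is only an illustration: `normalize` is a hypothetical helper, it handles positive numbers only, and it truncates after 23 fraction bits instead of rounding to nearest as real hardware does.

```python
from fractions import Fraction

def normalize(value, fraction_bits=23):
    """Return (e, bits) so that value is approximately 1.bits x 2^e.

    Illustrative sketch: bits are truncated, not rounded.
    """
    value = Fraction(value)
    if value <= 0:
        raise ValueError("this sketch only handles positive numbers")
    e = 0
    while value >= 2:      # shift right until the leading digit is 1
        value /= 2
        e += 1
    while value < 1:       # shift left until the leading digit is 1
        value *= 2
        e -= 1
    remainder = value - 1  # the part after "1."
    bits = []
    for _ in range(fraction_bits):
        remainder *= 2     # repeated doubling yields one binary digit per step
        if remainder >= 1:
            bits.append("1")
            remainder -= 1
        else:
            bits.append("0")
    return e, "".join(bits)

print(normalize("0.25"))      # exponent -2, all fraction bits zero
print(normalize("0.000001"))  # exponent -20
print(normalize("1.000001"))  # exponent 0
```

Passing the numbers as strings keeps them exact; passing a Python float would convert the already-rounded double instead.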

Note that the stored fraction is limited to 23 bits (24 significant bits if you count the implicit leading 1).

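In your example, 1.000001 is converted to 1.<19 zeros>100001100..., but only the first 23 fraction bits can be kept (1.<19 zeros>1000), so the stored value is 1 + 2^-20 = 1.00000095367431640625. For 0.000001, however, the 23 fraction bits start right after the first 1 (which is the 20th bit after the binary point), so with the exponent -20 the number is stored with much higher precision. The exact difference of the two stored values is about 0.99999995367, and the nearest Single to that is 1 - 2^-24, which is approximately 0.99999994.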
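The whole calculation can be reproduced outside .NET. The following Python sketch assumes that round-tripping a value through `struct`'s 4-byte float format is an acceptable stand-in for a Single; `to_float32` is a name invented for this example.

```python
import struct
from fractions import Fraction

def to_float32(value):
    """Round a Python float (a double) to the nearest float32, returned as a double."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

x = to_float32(1.000001)   # stored as 1 + 2**-20
y = to_float32(0.000001)   # stored as 8796093 * 2**-43

# Exact difference of the two stored values, with no rounding at all.
exact = Fraction(x) - Fraction(y)

# The subtraction as a Single would see it: the result is rounded to float32.
result = to_float32(x - y)

print("x      =", x)
print("y      =", y)
print("exact  =", float(exact))
print("result =", result)
print("result == 1 - 2**-24:", result == 1 - 2**-24)
```

The exact difference lies between two neighbouring float32 values, 1 - 2^-24 and 1, and is closer to the former, so that is what gets stored. Printing with seven significant digits then gives 0.9999999, while a display that shows more digits gives 0.99999994.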

Nader