
By now I have read a lot about this topic, not just now but again and again... and I thought I understood the point. It is a bit awkward that I have to ask this, but I still lack a proper solution that I feel right and safe with.

I used NumPy with dtype numpy.float64, which I understand to be double precision, precisely to avoid the usual floating-point problems. But in my tests it behaves just the same. I have some billions (really) of calculations to do (a further reason why I chose NumPy), so speed is of the essence, and I don't want to call np.round() after every step of my calculations... even though that would give an accurate result, because my inputs only have 3 digits after the decimal point. You could argue, of course: why not multiply everything by 1000 or 10000, problem solved, and NumPy does such a thing in an instant (a sketch of that idea follows the first example below). But it would lead to more problems in the further calculations, since there are a lot more calculations afterwards. Let's look at the problem:

import numpy as np

a = np.array([[7.125], [5.233]], dtype=np.float64)
b = np.array([[7.124], [5.232]], dtype=np.float64)
c = a - b
print(repr(c))


array([[0.001000000000000334 ],
       [0.0009999999999994458]])
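
(For completeness, here is a sketch of the multiply-by-1000 idea mentioned above. It keeps this subtraction exact, but as said, it doesn't fit my further calculations:)

import numpy as np

# work in integer thousandths so the 3-decimal inputs are exact
a = np.array([[7125], [5233]], dtype=np.int64)  # 7.125, 5.233 scaled by 1000
b = np.array([[7124], [5232]], dtype=np.int64)  # 7.124, 5.232 scaled by 1000
c = a - b
print(c)         # [[1] [1]] -- exactly one thousandth, no error
print(c / 1000)  # [[0.001] [0.001]] -- converting back to float reintroduces the representation error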

Easy enough!!! I need no explanation of why this happens, and I am not looking for a workaround with np.round() or for tinkering with np.set_printoptions(), which I know won't change my data, only the way it is presented to me. I thought a NumPy 64-bit double (128-bit is sadly not possible because all the big processing has to happen on my flatmate's Windows PC xD, and by now I doubt it would solve my problem anyway, but correct me if I am wrong!!!), which never has to hold more than 10 digits here, would be enough to do this precisely. Look what happens if I do this:

a = np.random.randint(1000, 1000000,(5, 2))
b = np.random.randint(1000, 1000000,(5, 2))
a = a / 1000
b = b / 1000
c = a - b
print(a.dtype)
print(c)

>>>float64
[[ 375.929 -833.91 ]
 [ 482.509 -106.411]
 [  -2.08   -64.672]
 [ 395.236 -383.997]
 [ 213.829 -101.08 ]]

No such precision "collapse" here; that's what I would like.
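
As a quick check with fixed values (a sketch; printing at full precision to see whether the error is really gone or just hidden behind NumPy's default 8-digit display):

import numpy as np

a = np.array([375929, 482509]) / 1000  # same construction as above, fixed values
b = np.array([833910, 106411]) / 1000
c = a - b
print(c)  # [-457.981  376.098] -- looks clean
np.set_printoptions(precision=17)
print(c)  # the digits past ~13 decimal places are typically nonzero now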

So, is there a "right" way to do this???
Thanks for listening to my story^^ and sorry for bringing this problem to the table again^^ Best regards

  • A 64-bit double actually holds about 17 decimal digits, although not exactly. Without more specific code it's impossible to make recommendations. You've probably already seen [Is floating point math broken?](https://stackoverflow.com/q/588004/5987) – Mark Ransom Aug 16 '19 at 22:44
  • I'm not sure what to say, other than you'll get used to it. 128-bit floats aren't going to help; they're still base-2 representations of numbers, not base-10. Numbers you might expect to be convenient for a computer aren't; they're still represented very accurately, it's just that when you print a decimal expansion to full accuracy they can look a little wonky – Sam Mason Aug 17 '19 at 15:16

1 Answer


Double-precision floating-point numbers have a lot of precision; you just need to get experience of when and where it matters. Some of the biggest models are trained with 32-bit floats (mostly for runtime performance) or even 16-bit variants.
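
For a rough feel for how much precision each width gives, a quick sketch using np.finfo:

import numpy as np

# approximate decimal digits of precision and machine epsilon per float width
for dt in (np.float16, np.float32, np.float64):
    info = np.finfo(dt)
    print(dt.__name__, info.precision, "digits, eps =", info.eps)
# float16 3 digits, eps = 0.000977
# float32 6 digits, eps = 1.1920929e-07
# float64 15 digits, eps = 2.220446049250313e-16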

There are a few functions like log1p and expm1 that can help in obscure places (although I've written code that uses the above functions, I'm not sure I've ever had values where it actually made a difference, and I've written numeric codes that have run for many CPU-years). Another useful thing to know about is catastrophic cancellation (see the demo below). Rewriting maths/equations can help a lot with numerical stability, though knowing where and when to do this can be difficult. Using standard algorithms really helps, e.g. spend some time looking through numpy.linalg, as a lot of work has gone into making sure they are well behaved.
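
For example (a minimal sketch: computing log(1 + x) for tiny x, where forming 1 + x first cancels away most of x's digits, while log1p avoids that):

import numpy as np

x = 1e-12
print(np.log(1 + x))  # ~1.000088900582341e-12 -- 1 + x already lost ~5 digits of x
print(np.log1p(x))    # ~9.999999999995e-13    -- accurate: log1p(x) = x - x**2/2 + ...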

Rounding is almost never the right thing to do and will tend to make matters significantly worse, especially when done on intermediate results; rounding should almost always be reserved for final display. E.g. you can easily make calculations 15 orders of magnitude worse by rounding at the wrong time: where a = 1.5 and b = 1.499999999999999, a - b is ~1e-15, while round(a) - round(b) is 1. Note that these values are only a few bits apart, and you should expect a few bits of error to accumulate, easily pushing values to either side of a rounding boundary. Also, for values that don't encode nicely into a binary floating-point number, this can happen in awkward places.
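
That example in code (a quick sketch; note that Python's round() uses round-half-to-even, but 1.5 still rounds up to 2 here):

a = 1.5
b = 1.499999999999999
print(a - b)                # ~1.1102230246251565e-15 -- the real difference is tiny
print(round(a) - round(b))  # 1 -- rounding first inflates it by 15 orders of magnitude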

Values like 0.001 can't be represented exactly and are always slightly "wrong". E.g. try decimal.Decimal(0.001) or (0.001).hex() to see the value that's actually being stored.
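
For example (showing the exact double that the literal 0.001 actually stores):

from decimal import Decimal

print(Decimal(0.001))  # 0.001000000000000000020816681711721685...
print((0.001).hex())   # 0x1.0624dd2f1a9fcp-10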

Sam Mason
  • Thanks for the answer. I am too new here, so my vote won't actually be counted, but it says my thumbs-up will be recorded. I will try to implement decimal.Decimal(), but I have to watch the speed of my code as well. It's a workaround as well, I think, but thanks!!! Please look at my question, I added a new example. – peter frost Aug 17 '19 at 12:21
  • I only pointed out `decimal` to help you understand what a float is representing; you almost certainly want to keep using floats, as they're orders of magnitude faster – Sam Mason Aug 17 '19 at 15:05