
After division, I get the wrong value:

...
.withColumn("percentage", regexp_replace(lit("10.62%"), "%", "").cast("double") / 100)
...

Expected value: 0.1062

Received value: 0.10619999999999

What's the problem here? Are there any solutions without a round operation?

@AbdennacerLachiheb Yes, you're right. I think this problem can only be solved with a round operation. I had a similar problem in another situation, and round was the only solution. – Alexander Lopatin Apr 07 '23 at 09:58

1 Answer


This is not related to Spark; you can try that division in pretty much any programming language, and you will get the same result:

print(10.62 / 100) # result is always 0.10619999999999999

In most programming languages, floating-point arithmetic is based on the IEEE 754 standard.

Here's a clearer explanation of how the IEEE 754 standard works:

In the IEEE-754 standard, hardware designers are allowed any value of error/epsilon as long as it's less than one half of one unit in the last place, and the result only has to be less than one half of one unit in the last place for one operation.
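
To see the effect outside Spark, here is a minimal plain-Python illustration (standard library only) that prints the exact binary value the double actually stores:

from decimal import Decimal

# Decimal(x) converts the float x exactly, exposing the binary value a
# double really holds. Neither 10.62 nor 0.1062 is exactly representable
# in base 2, so the quotient rounds to a nearby double.
print(10.62 / 100)            # 0.10619999999999999
print(Decimal(10.62 / 100))   # the exact stored value, slightly below 0.1062
print(10.62 / 100 == 0.1062)  # False: they round to two different doubles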

I'm afraid that the only solution is rounding.
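
If rounding is acceptable, Spark's built-in round function does it in one step. A minimal sketch, assuming the asker's df and a scale of 4 decimal places:

from pyspark.sql.functions import lit, regexp_replace, round as spark_round

# Round the double quotient to 4 decimal places, turning
# 0.10619999999999999 into 0.1062.
df.withColumn(
    "percentage",
    spark_round(regexp_replace(lit("10.62%"), "%", "").cast("double") / 100, 4),
)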

Another possible solution is to use a decimal type (DecimalType in PySpark, backed by Java's BigDecimal). You have to specify a precision, but only for the input; you don't have to care about the precision of the result:

df.withColumn("percentage", regexp_replace(lit("10.62%"), "%", "").cast(DecimalType(10, 2)) / 100)
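
For completeness, here is a self-contained sketch of the decimal approach; the one-row DataFrame is a hypothetical stand-in for the asker's df:

from pyspark.sql import SparkSession
from pyspark.sql.functions import lit, regexp_replace
from pyspark.sql.types import DecimalType

spark = SparkSession.builder.getOrCreate()
df = spark.range(1)  # hypothetical one-row stand-in for the asker's DataFrame

result = df.withColumn(
    "percentage",
    # Parse "10.62%" into an exact decimal, then divide; decimal
    # arithmetic keeps 10.62 / 100 as exactly 0.1062.
    regexp_replace(lit("10.62%"), "%", "").cast(DecimalType(10, 2)) / 100,
)
result.show()  # percentage holds exactly 0.1062 (trailing zeros may appear)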