I'm trying to write a Spark UDF that replaces the null values of a Double column with 0.0. I'm using the Dataset API. Here's the UDF:
val coalesceToZero = udf((rate: Double) => if (Option(rate).isDefined) rate else 0.0)
This is based on the following function, which I tested and which seems to work fine:
def cz(value: Double): Double = if (Option(value).isDefined) value else 0.0
cz(null.asInstanceOf[Double])
cz: (value: Double)Double
res15: Double = 0.0
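(In hindsight, this plain-Scala test may be misleading: casting `null` to a primitive `Double` yields the default value `0.0`, so `cz` never actually receives a null. A minimal check, plain Scala with no Spark involved:)

```scala
// Casting null to a primitive Double produces the default value 0.0,
// so the null is gone before cz's body even runs; Option(value) on a
// primitive Double parameter is therefore always defined.
val unboxed: Double = null.asInstanceOf[Double]
println(unboxed)                    // 0.0
println(Option(unboxed).isDefined)  // true
```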
But when I use it in Spark as follows, the UDF doesn't work:
myDS.filter($"rate".isNull)
.select($"rate", coalesceToZero($"rate")).show
+----+---------+
|rate|UDF(rate)|
+----+---------+
|null|     null|
|null|     null|
|null|     null|
|null|     null|
|null|     null|
|null|     null|
+----+---------+
However, the following works:
val coalesceToZero = udf((rate: Any) => if (rate == null) 0.0 else rate.asInstanceOf[Double])
So I was wondering whether Spark has some special way of handling null Double values.
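For reference, the two null-safe alternatives I've seen are declaring the parameter as boxed `java.lang.Double` (so a null can actually reach the function body) or skipping the UDF entirely with the built-in `coalesce`. A sketch, assuming `myDS` has a nullable Double column `rate` and a SparkSession's implicits are in scope (`coalesceToZeroBoxed` is just a name I made up):

```scala
import org.apache.spark.sql.functions.{coalesce, lit, udf}

// Boxed parameter type: the null survives the Scala function boundary
// instead of being replaced by the primitive default.
val coalesceToZeroBoxed = udf((rate: java.lang.Double) =>
  if (rate == null) 0.0 else rate.doubleValue)

myDS.select($"rate", coalesceToZeroBoxed($"rate")).show()

// Or avoid the UDF altogether with the built-in null-coalescing function:
myDS.select($"rate", coalesce($"rate", lit(0.0))).show()
```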