I have the following DataFrame:
import spark.implicits._ // spark is the SparkSession (predefined in spark-shell)
val df = Seq(
  ("a", Some(1.0)),
  ("b", None),
  ("c", Some(3.0))
).toDF("id", "x")
df.show()
+---+----+
| id| x|
+---+----+
| a| 1.0|
| b|null|
| c| 3.0|
+---+----+
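For reference, printSchema confirms that x is a nullable double column:
df.printSchema()
root
 |-- id: string (nullable = true)
 |-- x: double (nullable = true)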
Then I do:
df.as[(String, Double)]
  .collect
  .foreach(println)
(a,1.0)
(b,-1.0) <-- why??
(c,3.0)
So the null is converted to -1.0. Why is that? I expected it to be mapped to 0.0. Interestingly, that's indeed the case if I do:
df.select($"x")
  .as[Double]
  .collect
  .foreach(println)
1.0
0.0
3.0
I'm aware that in my case mapping to Option[Double] or java.lang.Double is the way to go, but I would still be interested in understanding what Spark does with non-nullable types such as Double.
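For completeness, here is the Option[Double] version I mean; there the null survives as None:
df.as[(String, Option[Double])]
  .collect
  .foreach(println)
(a,Some(1.0))
(b,None)
(c,Some(3.0))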
I'm using Spark 2.1.1 with Scala 2.10.6, by the way.
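In case it helps, I guess the relevant piece is the encoder's deserializer expression; it can be printed like this (ExpressionEncoder is internal API, so treat this as a sketch rather than a supported way to inspect it):
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
// internal API: prints the expression tree Spark uses to rebuild the tuple from a row
println(ExpressionEncoder[(String, Double)]().deserializer.treeString)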