I am trying to create a DataFrame from an RDD, using a case class.

I have observed that the String fields appear as nullable, while the Double fields appear as non-nullable.

Please help me understand this behaviour.

PS: I know that a field can be made nullable by declaring it as Option[Double], but I want to understand why this happens.

scala> case class Airport(uuid:String, timestamp:String, iata:String, airport:String, city:String, state:String, country:String, lat:Double, long:Double)

scala> val ap_df = ap_nohdr.map(r => Airport(r(0).trim, r(1).trim, r(2).trim, r(3).trim, r(4).trim, r(5).trim, r(6).trim, r(7).trim.toDouble, r(8).trim.toDouble)).toDF();

scala> ap_df.printSchema
root
 |-- uuid: string (nullable = true)
 |-- timestamp: string (nullable = true)
 |-- iata: string (nullable = true)
 |-- airport: string (nullable = true)
 |-- city: string (nullable = true)
 |-- state: string (nullable = true)
 |-- country: string (nullable = true)
 |-- lat: double (nullable = false)
 |-- long: double (nullable = false)
Remis Haroon - رامز
  • Please show the definition of `ap_nohdr` and explain why you're not just reading that as a Dataset to begin with. For instance, using the CSV reader – OneCricketeer Feb 04 '18 at 06:44
  • @cricket_007, thank you for your suggestion. I am aware that I can load it directly with a CSV reader; here I am doing it the RDD way. The intention is to learn and understand all the nuances Spark and Scala offer – Remis Haroon - رامز Feb 04 '18 at 06:47
  • A double is a number. A number cannot be nulled... Is that what you're looking for? – OneCricketeer Feb 04 '18 at 06:49
  • @cricket_007, yes, along those lines. Why can a String be null while a Double cannot? As I understand it, both are Scala objects (or is Double a primitive?) – Remis Haroon - رامز Feb 04 '18 at 06:51

1 Answer

A Scala String, like Java's, is an object, so it can be null.

A Scala Double compiles to the Java primitive double. Unlike java.lang.Double (which you're welcome to use in the case class), a primitive cannot hold null, which is why Spark marks those columns as non-nullable.

You can also refer to this section of the Scala docs on the Null type, which applies to Double as well:

Since Null is not a subtype of value types, null is not a member of any such type. For instance, it is not possible to assign null to a variable of type scala.Int.
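
The rule quoted above can be checked in plain Scala, without Spark. The following is a minimal sketch; the object name NullabilityDemo and its fields are made up for illustration and do not come from the question.

```scala
object NullabilityDemo {
  // String is a reference type (a subtype of AnyRef), so null is a legal value.
  val name: String = null

  // Uncommenting the next line is a compile error, because Null is not a
  // subtype of value types such as Double:
  // val lat: Double = null

  // The boxed Java type is a reference type, so it can hold null.
  val boxedLat: java.lang.Double = null

  // Option models "may be absent" without using null at all.
  val maybeLat: Option[Double] = None

  def main(args: Array[String]): Unit = {
    println(name == null)
    println(boxedLat == null)
    println(maybeLat.isEmpty)
  }
}
```

Only the commented-out assignment is rejected by the compiler; the other three declarations are accepted, which mirrors the nullable and non-nullable columns in the printed schema.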

As you have discovered, wrapping the field in Option is how you indicate a "nullable primitive".

scala: assign null to primitive
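
To see how the field type drives the nullable flag, here is a small sketch that compares the schemas Spark derives for two case classes. AirportStrict and AirportLenient are hypothetical, trimmed-down variants of the Airport class from the question, and the sketch assumes spark-sql is on the classpath.

```scala
import org.apache.spark.sql.Encoders

// Hypothetical variants of the question's Airport class, reduced to two fields.
case class AirportStrict(iata: String, lat: Double)
case class AirportLenient(iata: String, lat: Option[Double])

object SchemaDemo {
  def main(args: Array[String]): Unit = {
    // Encoders.product derives the schema from the case class by reflection,
    // so no SparkSession or data is needed just to inspect nullability.
    val strict = Encoders.product[AirportStrict].schema
    val lenient = Encoders.product[AirportLenient].schema

    strict.printTreeString()  // lat is a plain Double here
    lenient.printTreeString() // lat is an Option[Double] here
  }
}
```

In the first schema lat should be reported as non-nullable, as in the question's output, and in the second as nullable, because Option[Double] tells Spark the value may be absent.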

OneCricketeer