The following Spark snippet, in which the ACTARRDATE column is null in every row, behaves in a way I did not expect:
val rddSTG = sc.parallelize(
  List(
    ("RTD", "ANT", "SOYA BEANS", "20161123", "20161123", 4000, "docid11", null, 5),
    ("RTD", "ANT", "SOYA BEANS", "20161124", "20161123", 6000, "docid11", null, 4),
    ("RTD", "ANT", "BANANAS", "20161124", "20161123", 7000, "docid11", null, 9),
    ("HAM", "ANT", "CORN", "20161123", "20161123", 1000, "docid22", null, 33),
    ("LIS", "PAR", "BARLEY", "20161123", "20161123", 11111, "docid33", null, 44)
  )
)
val dataframe = rddSTG.toDF("ORIG", "DEST", "PROD", "PLDEPDATE", "PLARRDATE", "PLCOST", "docid", "ACTARRDATE", "mutationseq")
dataframe.createOrReplaceTempView("STG")
spark.sql("SELECT * FROM STG ORDER BY PLDEPDATE DESC").show()
It generates an error as follows:
scala.MatchError: Null (of class scala.reflect.internal.Types$TypeRef$$anon$6)
As soon as I change one of the null values to a non-null value, it works. I think I understand why: no type can be inferred for a field that is null in every row. It still seems odd, though. Any ideas?
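
For reference, here is a minimal sketch of what I assume is going on and one possible way around it. My understanding (not confirmed) is that a bare null makes Scala infer the static type Null for that tuple element, and Spark's reflection-based schema inference has no mapping for Null. If at least one row has a String there, the common element type widens to String, which would explain why a single non-null value is enough. Ascribing a type to the null should have the same effect. The names NullColumnSketch and stgFrame are made up for illustration, and only two of the rows are repeated to keep it short.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper object, used only to make the sketch self-contained.
object NullColumnSketch {

  // Builds the STG frame with the all-null column typed as String.
  def stgFrame(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // `null: String` gives the eighth tuple element the static type String
    // instead of Null, so an encoder can be derived for the tuple.
    val rows = List(
      ("RTD", "ANT", "SOYA BEANS", "20161123", "20161123", 4000, "docid11", null: String, 5),
      ("HAM", "ANT", "CORN", "20161123", "20161123", 1000, "docid22", null: String, 33)
    )

    spark.sparkContext
      .parallelize(rows)
      .toDF("ORIG", "DEST", "PROD", "PLDEPDATE", "PLARRDATE", "PLCOST", "docid", "ACTARRDATE", "mutationseq")
  }

  def main(args: Array[String]): Unit = {
    // Local session for the sketch; in spark-shell the existing `spark` would be used.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("null-column-sketch")
      .getOrCreate()

    val dataframe = stgFrame(spark)
    dataframe.printSchema()
    dataframe.createOrReplaceTempView("STG")
    spark.sql("SELECT * FROM STG ORDER BY PLDEPDATE DESC").show()

    spark.stop()
  }
}
```

Using Option.empty[String] in place of the typed null, or building the frame with createDataFrame and an explicit StructType, look like alternatives that avoid relying on inference for that column.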