How should I properly perform datetime parsing with the Spark 2.0 Dataset API?
There are lots of examples for the DataFrame / RDD API, like
- Spark date parsing
- Better way to convert a string field into timestamp in Spark
- How to change the column type from String to Date in DataFrames?
but a case class like

```scala
case class MyClass(myField: java.sql.Date)

val myNewDf = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .option("charset", "UTF-8")
  .option("delimiter", ",")
  .csv("pathToFile.csv")
  .as[MyClass]
```

is not enough to cast the type. How should I perform this properly using the Dataset API?
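One approach I have been considering (only a sketch: the `dd/MM/yyyy` format string and the column name are assumptions about my input file, not something Spark dictates) is to skip `inferSchema` entirely, declare the column type up front, and tell the CSV reader how the raw strings are formatted:

```scala
import java.sql.Date
import org.apache.spark.sql.types._

// Assumed shape of the target class; the field name is a placeholder.
case class MyClass(myField: Date)

// Declare the column as DateType instead of letting inferSchema guess.
val schema = StructType(Seq(
  StructField("myField", DateType, nullable = true)
))

val myNewDf = spark.read
  .option("header", "true")
  .option("charset", "UTF-8")
  .option("delimiter", ",")
  .option("dateFormat", "dd/MM/yyyy") // assumed format of the raw strings
  .schema(schema)
  .csv("pathToFile.csv")
  .as[MyClass]
```

With an explicit schema plus the `dateFormat` (or `timestampFormat`) reader option, the conversion happens during the read rather than being left to the encoder. Is this the intended way?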
**Edit**

Loading the data works; e.g. a `printSchema` shows

```
myDateField: timestamp (nullable = true)
```

But a `myDf.show` results in a

```
java.lang.IllegalArgumentException
    at java.sql.Date.valueOf(Date.java:143)
```

which led me to believe that my parsing of the dates was incorrect. How can this be?
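For what it's worth, I can reproduce the exception outside Spark: `java.sql.Date.valueOf` only accepts the JDBC escape format `yyyy-[m]m-[d]d`, and throws `IllegalArgumentException` for anything else (the `13/01/2016` string below is just a hypothetical non-ISO input):

```scala
import java.sql.Date

// The JDBC escape format parses fine.
val ok = Date.valueOf("2016-01-13")
println(ok) // 2016-01-13

// Any other layout throws IllegalArgumentException, which looks like
// the same failure surfacing from the encoder during show().
val bad =
  try {
    Date.valueOf("13/01/2016")
    "parsed"
  } catch {
    case _: IllegalArgumentException => "IllegalArgumentException"
  }
println(bad) // IllegalArgumentException
```

So it seems the strings reaching `Date.valueOf` are not in ISO `yyyy-MM-dd` form, even though the schema claims the column was parsed.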