I have a DataFrame with a TimestampType column; I'm reading the data manually and then constructing the DataFrame myself. In the input, the original DateTime column carries timezone information, e.g. 2011-11-04T00:05:23+04:00
Now when I read the data into a Spark Timestamp column, I see that the timezone information is gone!
This is how I construct the schema for my DataFrame:
var fields = ...
fields = fields :+ StructField("timestamp", TimestampType, false)
val schema = StructType(fields)
And this is how I parse the dates into a java.sql.Timestamp:
val date = new Timestamp(x)
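
To make the problem concrete, here is a minimal sketch of the full parse, assuming the raw value x is an ISO-8601 string like the example above. java.sql.Timestamp only stores an instant (milliseconds since the epoch), so the offset is dropped in the conversion:

import java.sql.Timestamp
import java.time.OffsetDateTime

// Parse the original string, keeping the offset, then convert to an instant.
// The resulting java.sql.Timestamp has no timezone field, so +04:00 is lost.
def parseTimestamp(raw: String): Timestamp = {
  val odt = OffsetDateTime.parse(raw) // e.g. 2011-11-04T00:05:23+04:00
  Timestamp.from(odt.toInstant)       // offset information is gone at this point
}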
I've ended up adding a separate column that contains the time zone, but is there a better option (other than making the column a StringType and serializing the original date)?
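
For reference, this is roughly what my separate-column workaround looks like (the tz_offset_seconds name and the IntegerType choice are just one way to do it):

import java.sql.Timestamp
import java.time.OffsetDateTime
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Keep the instant in a TimestampType column and the original UTC offset
// (in seconds) in a separate column, so the local time can be reconstructed.
val schemaWithOffset = StructType(Seq(
  StructField("timestamp", TimestampType, nullable = false),
  StructField("tz_offset_seconds", IntegerType, nullable = false)
))

def toRow(raw: String): Row = {
  val odt = OffsetDateTime.parse(raw)
  Row(Timestamp.from(odt.toInstant), odt.getOffset.getTotalSeconds)
}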