I have a column containing string data like "2023-03-13T15:18:14+0700". My final goal is to convert it to a proper timestamp format like "2023-03-13 15:18:14". Ideally, I would convert the time to GMT+7 (my location) and then drop the "T" and the "+XXXX" offset. If that is too hard or impossible, simply removing the "T" and "+0700" would be enough, since most of my data is already "+0700".
I have read many posts on Stack Overflow but have had no luck so far. For example, here, here, and the closest one is this, but their format is a bit different from mine.
Below is what I tried, based on that closest post:
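import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_timestamp}

object test extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()
  import spark.implicits._

  // Single-row sample in the same shape as my real column
  val df = Seq("2023-03-13T15:18:14+0700").toDF("time")

  // Attempt 1: pattern with fractional seconds
  val result = df.select(to_timestamp(col("time"), "yyyy-MM-dd'T'hh:mm:ss.SSSXXX").alias("newtime"))
  result.show(truncate = false) // Null

  // Attempt 2: same pattern without fractional seconds
  val result1 = df.select(to_timestamp(col("time"), "yyyy-MM-dd'T'hh:mm:ssXXX").alias("newtime"))
  result1.show(truncate = false) // Null
}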
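To make the target output concrete, here is a minimal sketch of the conversion I am after, written with plain java.time rather than Spark. The names `OffsetSketch` and `toGmt7` are hypothetical and only for illustration. The pattern uses `HH` (24-hour) and `xx` (an offset without a colon, such as +0700). I am assuming, but have not confirmed, that `to_timestamp` in Spark 3 interprets these pattern letters in a similar way.

```scala
import java.time.{OffsetDateTime, ZoneOffset}
import java.time.format.DateTimeFormatter

object OffsetSketch {
  // "HH" is the 24-hour field; "xx" matches an offset without a colon, e.g. +0700
  private val inputFormat = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssxx")
  private val outputFormat = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Hypothetical helper: shift the parsed instant to GMT+7,
  // then format it without the "T" and without the offset
  def toGmt7(raw: String): String =
    OffsetDateTime
      .parse(raw, inputFormat)
      .withOffsetSameInstant(ZoneOffset.ofHours(7))
      .format(outputFormat)

  def main(args: Array[String]): Unit = {
    println(toGmt7("2023-03-13T15:18:14+0700")) // 2023-03-13 15:18:14
    println(toGmt7("2023-03-13T08:18:14+0000")) // 2023-03-13 15:18:14
  }
}
```

Both sample rows end up as "2023-03-13 15:18:14", which is the result I would like to get for the Spark column as well.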