
I have a column `Time` in my Spark DataFrame. It is a string type, and I need to convert it to a timestamp. I have tried the following:

data.select(unix_timestamp(data.Time, 'yyyy/MM/dd HH:mm:ss').cast(TimestampType()).alias("timestamp"))

data.printSchema()

The output is:

root
 |-- Time: string (nullable = true)

If I save the result to a new DataFrame, I lose all of my other columns.

    Does this answer your question? [Convert pyspark string to date format](https://stackoverflow.com/questions/38080748/convert-pyspark-string-to-date-format) – werner Oct 19 '20 at 19:05
  • It does not, I have checked the resources before posting the question. – Chique_Code Oct 19 '20 at 19:37

1 Answer


You can use `withColumn` instead of `select`. A `select` returns a new DataFrame containing only the columns you ask for, and since transformations never modify a DataFrame in place, `data.printSchema()` keeps showing the original schema unless you assign the result. `withColumn` keeps all existing columns and appends the new one:

from pyspark.sql.functions import unix_timestamp
from pyspark.sql.types import TimestampType

data = spark.createDataFrame([('1997/02/28 10:30:00', "test")], ['Time', 'Col_Test'])

df = data.withColumn("timestamp", unix_timestamp(data.Time, 'yyyy/MM/dd HH:mm:ss').cast(TimestampType()))

Output:

>>> df.show()
+-------------------+--------+-------------------+
|               Time|Col_Test|          timestamp|
+-------------------+--------+-------------------+
|1997/02/28 10:30:00|    test|1997-02-28 10:30:00|
+-------------------+--------+-------------------+

>>> data.printSchema()
root
 |-- Time: string (nullable = true)
 |-- Col_Test: string (nullable = true)

>>> df.printSchema()
root
 |-- Time: string (nullable = true)
 |-- Col_Test: string (nullable = true)
 |-- timestamp: timestamp (nullable = true)
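
A minimal alternative sketch, assuming Spark 2.2 or later where `to_timestamp` is available: it parses the string and produces a timestamp column in one step, so the explicit cast is unnecessary. This reuses the `data` DataFrame from above.

from pyspark.sql.functions import to_timestamp

# to_timestamp parses the string straight into a timestamp column,
# so no unix_timestamp + cast round-trip is needed (Spark 2.2+)
df = data.withColumn("timestamp", to_timestamp(data.Time, 'yyyy/MM/dd HH:mm:ss'))

And if you prefer to stay with `select`, passing `"*"` first keeps every existing column alongside the new one:

df2 = data.select("*", to_timestamp(data.Time, 'yyyy/MM/dd HH:mm:ss').alias("timestamp"))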

Sanket9394