
I have a pyspark dataframe that looks like the following

    df
    
       year   month   day 
       2017    9       3 
       2015    5      16

I would like to add a datetime column, like the following:

    df
    
       year   month   day           date
       2017    9       3    2017-09-03 00:00:00
       2015    5      16    2015-05-16 00:00:00
asked by emax (edited by Adrian Mole)
  • Have a look at [this answer](https://stackoverflow.com/a/33683462/14246131) to a similar question. – FelixF Sep 30 '20 at 09:38

1 Answer

You can use `concat_ws` to concatenate the columns and `to_date` to convert the result to a date:

    from pyspark.sql.functions import concat_ws, to_date

    df = spark.createDataFrame([[2017, 9, 3], [2015, 5, 16]], ['year', 'month', 'date'])
    df = df.withColumn('timestamp', to_date(concat_ws('-', df.year, df.month, df.date)))
    df.show()

    +----+-----+----+----------+
    |year|month|date| timestamp|
    +----+-----+----+----------+
    |2017|    9|   3|2017-09-03|
    |2015|    5|  16|2015-05-16|
    +----+-----+----+----------+

Schema:

    df.printSchema()

    root
     |-- year: long (nullable = true)
     |-- month: long (nullable = true)
     |-- date: long (nullable = true)
     |-- timestamp: date (nullable = true)
answered by Equinox