
How can I convert a yyyymmddhhmmss value to a timestamp in a PySpark DataFrame?

Example: my input is 20180718093158 and I want the result 2018-07-18 09:31:58.

  • Duplicate of https://stackoverflow.com/questions/38080748/convert-pyspark-string-to-date-format – jdaz May 29 '20 at 19:54
  • Does this answer your question? [Convert pyspark string to date format](https://stackoverflow.com/questions/38080748/convert-pyspark-string-to-date-format) – Hossein Torabi May 29 '20 at 20:32
  • The given solution doesn't help. My input format is 20180718093158, not 2018/07/18 093158. – Mathi vanan May 30 '20 at 15:20
  • What about this: df = df.select('your_input', from_unixtime(unix_timestamp('your_input', 'yyyyMMddHHmmss')).alias('datetime')) – DanG May 30 '20 at 16:49
  • No, that is not working either. I tried string split and concatenation operations, but I'm looking for a proper timestamp conversion. – Mathi vanan Jun 01 '20 at 06:19

1 Answer


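First, cast your "date" column to string, then apply the to_timestamp() function with the format "yyyyMMddHHmmss" as the second argument. Note the lowercase "ss" for seconds; uppercase "S" denotes fractions of a second. For example:

from pyspark.sql import functions as F

# Cast to string first so that integer columns are parsed as well.
df = df.withColumn(
    "date",
    F.to_timestamp(F.col("date").cast("string"), "yyyyMMddHHmmss")
)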
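
To sanity-check the expected result for a sample value without starting a Spark session, the same parse can be sketched with Python's standard library. This is only an illustration, not part of PySpark: parse_compact_timestamp is a hypothetical helper name, and strptime uses %-style directives (%Y%m%d%H%M%S) rather than Spark's pattern letters.

```python
from datetime import datetime


def parse_compact_timestamp(value):
    """Parse a yyyymmddhhmmss value (hypothetical helper, stdlib only)."""
    # str() mirrors the cast("string") step, so integer input also works.
    # %Y%m%d%H%M%S is strptime's spelling of the pattern yyyyMMddHHmmss.
    return datetime.strptime(str(value), "%Y%m%d%H%M%S")


parsed = parse_compact_timestamp(20180718093158)
print(parsed)  # 2018-07-18 09:31:58
```

If the Spark column shows a different value for the same input, the format string passed to to_timestamp() is the first thing to check.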