-1

I have a date value in a column of string type that takes this format:

06-MAY-16 09.17.15

I want to convert it to this format:

20160506

I have tried using DATE_FORMAT(TO_DATE(<column>), 'yyyyMMdd') but a NULL value is returned.

Does anyone have any ideas about how to go about doing this in pyspark or spark SQL?

Thanks

Ronak Jain
  • Does this answer your question? [Pyspark date yyyy-mmm-dd conversion](https://stackoverflow.com/questions/50607059/pyspark-date-yyyy-mmm-dd-conversion) – Kashyap Jan 09 '23 at 21:58

3 Answers

0

I've got it! This is the code I used, which seems to work:

FROM_UNIXTIME(UNIX_TIMESTAMP(<column>, 'dd-MMM-yy HH.mm.ss'), 'yyyyMMdd')

Hope this helps others!

0

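Your original attempt is close to the solution. TO_DATE() without a format argument only accepts strings it can cast directly to a date, such as yyyy-MM-dd, so it returns NULL for this input. You just need to pass the input pattern as the second argument to TO_DATE(). This will work as well: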

DATE_FORMAT(TO_DATE(<col>, 'dd-MMM-yy HH.mm.ss'), 'yyyyMMdd')

And for pyspark:

import pyspark.sql.functions as F

# Parse the string with the pattern it actually uses, then render the date as yyyyMMdd.
df = df.withColumn('<col>', F.date_format(F.to_date(F.col('<col>'), 'dd-MMM-yy HH.mm.ss'), 'yyyyMMdd'))
gamezone25
0

Convert your string to a date before you try to 'reformat' it.
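
As a rough illustration of that two-step idea outside Spark, here is a minimal plain-Python sketch: parse the string with a pattern that matches its layout, then format the resulting value. The helper name `reformat_date` is hypothetical, the `strptime` directives are Python's counterparts of the Spark pattern rather than Spark syntax, and month-name parsing assumes an English locale.

```python
from datetime import datetime


# Hypothetical helper, not part of Spark: a plain-Python analogue of
# DATE_FORMAT(TO_DATE(col, 'dd-MMM-yy HH.mm.ss'), 'yyyyMMdd').
def reformat_date(raw: str) -> str:
    # Step 1: parse the string using a pattern that matches its layout.
    # %b matches abbreviated month names (assumes an English locale).
    parsed = datetime.strptime(raw, "%d-%b-%y %H.%M.%S")
    # Step 2: render the parsed value in the target layout.
    return parsed.strftime("%Y%m%d")


print(reformat_date("06-MAY-16 09.17.15"))  # prints 20160506
```

This can be handy for checking the expected result of a single value before running the Spark expression over the whole column.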

Kashyap