I need to create a new column in a Spark DataFrame that contains a date type. Currently it is a string, "Mon Jan 09 2021", but it needs to be in the format YYYY-MM-DD.
I tried several ways but they all failed. How can I do this?
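January 9, 2021 is a Saturday, not a Monday, so first check that your DataFrame contains valid data; otherwise this won't work. Assuming the column with the string data is called date_string, and we change the string data to "Sat Jan 09 2021", we can use
SELECT
TO_DATE(date_string, "EEE MMM dd yyyy") AS date
to cast the data into a date. The month needs three letters (MMM) because the input contains the abbreviated month name "Jan"; MM only matches a two-digit month number. On Spark 3.x, the default parser does not accept the day-of-week letter E when parsing, so there you may need to set spark.sql.legacy.timeParserPolicy to LEGACY or strip the weekday prefix before parsing. By default, the result should display as yyyy-MM-dd, which is the format you want, but in case it doesn't, you can use date_format to specify the format you need. The specification for the format string to pass in is here. This specification is also used in the to_date() function I used above.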
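If you want to try the validity check from the start of this answer outside Spark, here is a minimal sketch in plain Python. The helper name parse_date_string and the sample values are made up for illustration, and it assumes the default C/English locale so that "Sat" and "Jan" are recognised. It parses the string with the strptime equivalent of "EEE MMM dd yyyy" and rejects values whose weekday name does not match the calendar date.

```python
from datetime import date, datetime
from typing import Optional

# strptime equivalent of the Spark pattern "EEE MMM dd yyyy"
INPUT_FORMAT = "%a %b %d %Y"


def parse_date_string(value: str) -> Optional[date]:
    """Hypothetical helper: return the date, or None if the text is
    malformed or its weekday name does not match the calendar date."""
    try:
        parsed = datetime.strptime(value, INPUT_FORMAT).date()
    except ValueError:
        return None
    # The weekday name is compared explicitly, because a successful
    # parse alone is not treated here as proof that it is consistent.
    stated_weekday = value.split()[0]
    if parsed.strftime("%a").lower() != stated_weekday.lower():
        return None
    return parsed


# Illustrative sample values, including the one from the question.
samples = ["Sat Jan 09 2021", "Mon Jan 09 2021", "not a date"]
for text in samples:
    parsed = parse_date_string(text)
    print(text, "->", parsed.isoformat() if parsed else "invalid")
```

Running something like this over a small sample of the column would show whether the weekday names in your data can be trusted before you rely on them in the Spark format string.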