I have a column containing unix timestamp data (in milliseconds), which Spark interprets as Long type, for example:

+---------------+
| my_timestamp  |
+---------------+
| 1584528257638 |
| 1586618807677 |
| 1585923477767 |
| 1583314882085 |
+---------------+
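
(For reference, a minimal sketch that reproduces this column; the DataFrame construction here is just illustrative, not my actual pipeline:)

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(1584528257638,), (1586618807677,), (1585923477767,), (1583314882085,)],
    ["my_timestamp"],
)
df.printSchema()
# root
#  |-- my_timestamp: long (nullable = true)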

I'd like to convert it into a human-readable timestamp format, for example something like:

+------------------------+
|      my_timestamp      |
+------------------------+
|2020-03-18 10:44:17.638 |
|2020-04-11 16:26:47.677 |
|2020-04-03 15:17:57.767 |
|2020-03-04 09:41:22.085 |
+------------------------+

How can I do that?

  • @AndréMachado I just wanted to share my approach to converting a unix timestamp with a simple type cast (see the answer), as it was very useful to me and I don't see it used much – Vzzarr Feb 09 '21 at 12:20
  • Guys, I don't get it... I'm asking for a conversion to timestamp, and the questions you reported as potential duplicates are about converting to date (besides, one is in Scala...). Wouldn't it be better to have some discussion before marking as a duplicate? – Vzzarr Feb 10 '21 at 10:10
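
(To make that timestamp-vs-date distinction concrete, a sketch assuming the df defined above; the column aliases are hypothetical:)

from pyspark.sql.types import DateType, TimestampType
import pyspark.sql.functions as F

secs = F.col("my_timestamp") / 1000  # milliseconds -> seconds

df.select(
    secs.cast(TimestampType()).alias("as_timestamp"),              # e.g. 2020-03-18 10:44:17.638 (UTC)
    secs.cast(TimestampType()).cast(DateType()).alias("as_date"),  # e.g. 2020-03-18
)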

1 Answer


Since the timestamp column is in milliseconds, it is enough to convert it to seconds and cast it to TimestampType; that should do the trick:

from pyspark.sql.types import TimestampType
import pyspark.sql.functions as F

df.select(
    # divide by 1000 (milliseconds -> seconds), then cast to a timestamp
    (F.col("my_timestamp") / 1000).cast(TimestampType()).alias("my_timestamp")
)
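
For example, applied to the sample data from the question (a self-contained sketch; the createDataFrame setup is illustrative, and the displayed value depends on the session time zone, spark.sql.session.timeZone):

from pyspark.sql import SparkSession
from pyspark.sql.types import TimestampType
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1584528257638,)], ["my_timestamp"])

df.select(
    (F.col("my_timestamp") / 1000).cast(TimestampType()).alias("my_timestamp")
).show(truncate=False)
# With the session time zone set to UTC this prints:
# +-----------------------+
# |my_timestamp           |
# +-----------------------+
# |2020-03-18 10:44:17.638|
# +-----------------------+

Note that dividing a Long column by 1000 produces a Double, and casting a Double to TimestampType interprets it as seconds since the epoch, which is why the millisecond fraction is preserved.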