
I am trying to write code to convert the date-time columns date and last_updated_date, which are actually Unix times cast as doubles, into "mm-dd-yyyy" format for display. How do I do this?

import org.joda.time._
import scala.tools._
import org.joda.time.format.DateTimeFormat._
import java.text.SimpleDateFormat
import org.apache.spark.sql.functions.{unix_timestamp, to_date}
root
 |-- date: double (nullable = false)
 |-- last_updated_date: double (nullable = false)
 |-- Percent_Used: double (nullable = false)

+------------+---------------------+------------+
|        date|    last_updated_date|Percent_Used|
+------------+---------------------+------------+
| 1.453923E12|        1.47080394E12| 1.948327124|
|1.4539233E12|        1.47080394E12| 2.019636442|
|1.4539236E12|        1.47080394E12| 1.995299371|
+------------+---------------------+------------+

3 Answers


Cast to timestamp:

import org.apache.spark.sql.functions.col

df.select(col("date").cast("timestamp"))
  • Hmm, I think there is some issue with this, since date is a "long" and that is causing the dates to go out of whack when doing this directly – Leothorn Aug 29 '16 at 22:37
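A minimal sketch of the fix that comment points at, assuming (as the sample values like 1.453923E12 suggest) the doubles are Unix times in milliseconds: cast("timestamp") reads a numeric column as seconds since the epoch, so divide by 1000 first, then use date_format for the display pattern (note that MM means months in Java date patterns; mm would be minutes):

import org.apache.spark.sql.functions.{col, date_format}

// Milliseconds -> seconds, cast to timestamp, then format as month-day-year.
val formatted = df
  .withColumn("date",
    date_format((col("date") / 1000).cast("timestamp"), "MM-dd-yyyy"))
  .withColumn("last_updated_date",
    date_format((col("last_updated_date") / 1000).cast("timestamp"), "MM-dd-yyyy"))

formatted.show()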

Convert it to a timestamp using from_unixtime:

import org.apache.spark.sql.functions.{col, from_unixtime}

df.select(from_unixtime(col("date")).as("date"))
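Note that from_unixtime also expects seconds and accepts an optional format pattern, so with the millisecond doubles from the question a fuller sketch might look like this (again assuming the DataFrame is df, and using MM for months since mm means minutes):

import org.apache.spark.sql.functions.{col, from_unixtime}

// from_unixtime(seconds, pattern) returns a formatted string column.
df.select(
  from_unixtime((col("date") / 1000).cast("long"), "MM-dd-yyyy").as("date"),
  from_unixtime((col("last_updated_date") / 1000).cast("long"), "MM-dd-yyyy").as("last_updated_date"),
  col("Percent_Used")
).show()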

Fetching datetime from float in Python

This answer works for me; give it a try. It is actually a seconds calculation:

import datetime

serial = 43822.59722222222
seconds = (serial - 25569) * 86400.0
print(datetime.datetime.utcfromtimestamp(seconds))

Convert excel timestamp double value into datetime or timestamp
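This answer deals with Excel serial dates (days since 1900) rather than Unix milliseconds, but the same arithmetic ports to the question's Spark/Scala setting. A sketch, assuming a hypothetical column serial_date holding Excel serial days:

import org.apache.spark.sql.functions.{col, from_unixtime}

// Excel serial days -> Unix seconds: subtract the 1970-01-01 offset
// (25569 days) and scale by 86400 seconds per day, then format.
df.select(
  from_unixtime(((col("serial_date") - 25569) * 86400).cast("long"), "MM-dd-yyyy")
    .as("serial_date")
).show()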
