
My data currently stores dates in this format: 2013-07-25 00:00:00.0. For example:

orders.take(10).foreach(println)

1,2013-07-25 00:00:00.0,11599,CLOSED
2,2012-07-25 00:00:00.0,256,PENDING_PAYMENT
3,2011-07-25 00:00:00.0,12111,COMPLETE
4,2014-07-25 00:00:00.0,8827,CLOSED
5,2015-07-25 00:00:00.0,11318,COMPLETE
6,2016-07-25 00:00:00.0,7130,COMPLETE
7,2017-07-25 00:00:00.0,4530,COMPLETE
8,2018-07-25 00:00:00.0,2911,PROCESSING
9,2019-07-25 00:00:00.0,5657,PENDING_PAYMENT
10,2009-07-25 00:00:00.0,5648,PENDING_PAYMENT

I know how to convert the string to int:

val ordersMap = orders.map(a=>(
a.split(",")(0).toInt, 
a.split(",")(1), 
a.split(",")(2).toInt, 
a.split(",")(3)
))

But for the second column, which holds the date as a string, I am looking for an easy way, like .toInt, to parse it into a datetime.

Is there a simple way to do that for all the rows in the dataframe, and a flexible way to accommodate different datetime formats such as yyyy/MM/dd, MM/dd/yyyy, dd/MM/yyyy, etc.?

Thank you.

[UPDATE1] I tried @smac89's suggestion, with no luck. (The screenshot originally attached here is not available.)

mdivk
  • Try [How to Convert String to LocalDateTime in Java 8 - Example Tutorial](https://www.java67.com/2016/04/how-to-convert-string-to-localdatetime-in-java8-example.html) – Ole V.V. Feb 04 '20 at 15:22
  • Does this answer your question? [How to parse/format dates with LocalDateTime? (Java 8)](https://stackoverflow.com/questions/22463062/how-to-parse-format-dates-with-localdatetime-java-8) – Ole V.V. Feb 04 '20 at 15:23
  • Are you using `apache-spark`? If so, you can load directly as `Timestamp` datatype. – Cesar A. Mostacero Feb 05 '20 at 21:36
  • Thank you Cesar, can you write up an answer with screenshot please? I am using Databricks – mdivk Feb 05 '20 at 23:07

2 Answers


You can call LocalDate.parse directly, as in the linked duplicate. AFAIK there is no built-in String extension like .toInt for dates, but you can easily create your own:

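import java.time.LocalDate
import java.time.format.DateTimeFormatter

implicit class StringDates(ds: String) {
    // default: ISO date such as "2013-07-25"
    def toLocalDate: LocalDate = LocalDate.parse(ds, DateTimeFormatter.ISO_LOCAL_DATE)
    def toLocalDate(fmt: DateTimeFormatter): LocalDate = LocalDate.parse(ds, fmt)
}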

Now you can do:

"2013-07-25".toLocalDate

Or pass in a formatter by doing:

"2013-07-25".toLocalDate(fmt)

Try it on Scastie 1
Try it on Scastie 2

You can create more formatters easily by doing:

DateTimeFormatter.ofPattern("yyyy/MM/dd")

See Patterns for Formatting and Parsing
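
The sample values in the question carry a time part (2013-07-25 00:00:00.0), so a date-only formatter would reject the full string. A minimal sketch, assuming Java 8+ java.time: the object name FlexibleDates, the helper parseFlexible, and the particular list of patterns are all hypothetical choices for illustration. It tries the timestamp pattern first, then each date-only pattern in order, and returns the first successful parse:

```scala
import java.time.{LocalDate, LocalDateTime}
import java.time.format.DateTimeFormatter
import scala.util.{Success, Try}

object FlexibleDates {
  // Pattern for values such as "2013-07-25 00:00:00.0" (S = fraction of second).
  val timestampPattern: DateTimeFormatter =
    DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.S")

  // Candidate date-only patterns, tried in order (an assumption; adjust to your data).
  // MM is month-of-year; lowercase mm would be minute-of-hour.
  val datePatterns: List[DateTimeFormatter] =
    List("yyyy-MM-dd", "yyyy/MM/dd", "MM/dd/yyyy").map(p => DateTimeFormatter.ofPattern(p))

  // Hypothetical helper: first successful parse wins, None if nothing matches.
  def parseFlexible(s: String): Option[LocalDateTime] =
    Try(LocalDateTime.parse(s, timestampPattern)).toOption.orElse(
      datePatterns.iterator
        .map(f => Try(LocalDate.parse(s, f).atStartOfDay))
        .collectFirst { case Success(dt) => dt }
    )
}
```

Note that MM/dd/yyyy and dd/MM/yyyy cannot both be guessed reliably: a value like 03/04/2013 is valid under either, so whichever pattern comes first in the list decides. If both conventions occur in the data, the format has to be known per source rather than inferred.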

smac89
  • @mdivk that was my mistake. For some reason, I thought we were talking about `kotlin` and didn't see the `scala` tag. I will change my answer – smac89 Feb 03 '20 at 13:59
  • Thank you @smac89, would it be possible for you to share a bit more details? like is there any libraries to import? Can you cut a screenshot of your notebook? – mdivk Feb 03 '20 at 15:27
  • Thank you, I will check it out later and update here – mdivk Feb 03 '20 at 16:15
  • Thank you for the suggestion in your edit, that's for Java, not Scala – mdivk Feb 04 '20 at 03:10

Here is what I ended up with, cumbersome but working:

import java.time._
import java.time.format.DateTimeFormatter
import org.apache.spark.sql.functions._

......

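// MM is month-of-year; lowercase mm would mean minute-of-hour
val datetime_format = DateTimeFormatter.ofPattern("yyyy-MM-dd")
val test="2013-07-25 00:00:..."
// keep only the leading yyyy-MM-dd part, then parse it into a LocalDate
val myd = test.substring(0,10)
val mydate = LocalDate.parse(myd, datetime_format)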
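
The same idea can be applied to a whole row instead of a single string. A sketch, assuming every line has the four comma-separated fields shown in the question: the Order case class, its field names (customerId in particular is a guess at what the third column means), and parseOrder are hypothetical. On an RDD of lines this would presumably be used as orders.map(OrderParsing.parseOrder), which I have not verified on Databricks:

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

object OrderParsing {
  // Hypothetical record type for one line of the sample data.
  final case class Order(id: Int, date: LocalDateTime, customerId: Int, status: String)

  // Matches values such as "2013-07-25 00:00:00.0".
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.S")

  // Split the line once, then convert each field.
  // Throws on rows with the wrong number of fields or unparseable values.
  def parseOrder(line: String): Order =
    line.split(",") match {
      case Array(id, date, customer, status) =>
        Order(id.toInt, LocalDateTime.parse(date, fmt), customer.toInt, status)
      case _ =>
        throw new IllegalArgumentException(s"Unexpected row: $line")
    }
}
```

This keeps the full timestamp rather than cutting the string down to its first 10 characters, and it splits each line only once.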
mdivk