
I am using Apache Spark 2.0 and creating a case class to define the schema for a Dataset. When I try to define a custom encoder for java.time.LocalDate, following How to store custom objects in Dataset?, I get the following exception:

java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
- field (class: "java.time.LocalDate", name: "callDate")
- root class: "FireService"
at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:598)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:592)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:583)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
............

Following is my code:

case class FireService(callNumber: String, callDate: java.time.LocalDate)
implicit val localDateEncoder: org.apache.spark.sql.Encoder[java.time.LocalDate] = org.apache.spark.sql.Encoders.kryo[java.time.LocalDate]

val fireServiceDf = df.map(row => {
  // Column 4 holds the call date as text, e.g. month/day/year
  val dateFormatter = java.time.format.DateTimeFormatter.ofPattern("MM/dd/yyyy")
  FireService(row.getAs[String](0), java.time.LocalDate.parse(row.getAs[String](4), dateFormatter))
})

How can we define an encoder for a third-party API's types in Spark?

Update

When I create an encoder for the whole case class, df.map maps each object to binary, as below:

implicit val fireServiceEncoder: org.apache.spark.sql.Encoder[FireService] = org.apache.spark.sql.Encoders.kryo[FireService]

val fireServiceDf = df.map(row => {
 val dateFormatter = java.time.format.DateTimeFormatter.ofPattern("MM/dd/yyyy")
 FireService(row.getAs[String](0), java.time.LocalDate.parse(row.getAs[String](4), dateFormatter))
})

fireServiceDf: org.apache.spark.sql.Dataset[FireService] = [value: binary]

I expected a Dataset with FireService's fields as columns, but I get a single binary column instead.

zero323
Harmeet Singh Taara

1 Answer


As the last comment there says, "if class contains a field Bar you need encoder for a whole object." You need to provide an implicit Encoder for FireService itself; otherwise Spark constructs one for you using SQLImplicits.newProductEncoder[T <: Product : TypeTag]: Encoder[T]. You can see from the signature that it takes no implicit Encoder parameters for the fields, so it has no way to pick up localDateEncoder.

Spark could be changed to handle this e.g. using the Shapeless library, or using macros directly; I don't know whether this is the plan in the future.
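A possible workaround, which is not part of the original answer: keep java.time.LocalDate out of the case class and store a java.sql.Date instead. The last comment under this answer reports that java.sql.Timestamp produces a proper schema, and Spark 2.0's built-in product encoder handles java.sql.Date the same way, mapping it to a DATE column. The sketch below uses only JDK classes; FireServiceRow and fromStrings are hypothetical names introduced here for illustration.

```scala
import java.sql.Date
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Hypothetical variant of the question's case class: callDate is a
// java.sql.Date, so no custom or Kryo encoder should be needed.
case class FireServiceRow(callNumber: String, callDate: Date)

object FireServiceRow {
  // Same pattern as in the question's update; DateTimeFormatter is thread-safe.
  private val dateFormatter = DateTimeFormatter.ofPattern("MM/dd/yyyy")

  // Parse with java.time, then convert to java.sql.Date at the Dataset boundary.
  def fromStrings(callNumber: String, rawDate: String): FireServiceRow =
    FireServiceRow(callNumber, Date.valueOf(LocalDate.parse(rawDate, dateFormatter)))
}
```

With import spark.implicits._ in scope, the mapping from the question would become df.map(row => FireServiceRow.fromStrings(row.getAs[String](0), row.getAs[String](4))), and callDate.toLocalDate recovers a LocalDate wherever one is needed. In spark-shell, the case class and its companion object have to be entered together in :paste mode.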

Alexey Romanov
  • Hey @Alexey, I got your point, but I still don't get the exact reason: why do we require a formatter for the full object? – Harmeet Singh Taara Aug 03 '16 at 15:46
  • I got your point. I also updated the question, because now my data is converted into binary. When I use Timestamp instead of LocalDate, the data schema is built as FireService; otherwise it is binary. – Harmeet Singh Taara Aug 04 '16 at 12:45
  • Please ask that as a separate question. In general, don't edit a question to ask a different one. – Alexey Romanov Aug 04 '16 at 13:48
  • @AlexeyRomanov I'm facing the same issue. Do you have a code example of how to encode the whole project? Many Thanks! – Rock Aug 10 '16 at 04:18