1

I have a case class which contains a enumeration field "PersonType". I would like to insert this record to a Hive table.

object PersonType extends Enumeration {
  type PersonType = Value
  val BOSS = Value
  val REGULAR = Value
}

case class Person(firstname: String, lastname: String)
case class Holder(personType: PersonType.Value, person: Person)

And:

val hiveContext = new HiveContext(sc)
import hiveContext.implicits._

val item = new Holder(PersonType.REGULAR, new Person("tom", "smith"))
val content: Seq[Holder] = Seq(item)

val data : RDD[Holder] = sc.parallelize(content)
val df = data.toDF()

... When I try to convert the corresponding RDD to DataFrame, I get the following exception:

Exception in thread "main" java.lang.UnsupportedOperationException:     
Schema for type com.test.PersonType.Value is not supported   
        ...
        at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:691)
        at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:30)
        at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:630)
        at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:30)
        at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:414)
        at org.apache.spark.sql.SQLImplicits.rddToDataFrameHolder(SQLImplicits.scala:94)    

I'd like to convert PersonType to String before inserting to Hive. Is it possible to extend the implicitconversion to handle PersonType as well? I tried something like this but didn't work:

 object PersonTypeConversions {
    implicit def toString(personType: PersonTypeConversions.Value): String = personType.toString()
 }
import PersonTypeConversions._

Spark: 1.6.0

zero323
  • 322,348
  • 103
  • 959
  • 935
Bruckwald
  • 797
  • 8
  • 23
  • Possible duplicate of [How to define schema for custom type in Spark SQL?](http://stackoverflow.com/questions/32440461/how-to-define-schema-for-custom-type-in-spark-sql) – zero323 Jun 24 '16 at 18:51
  • Thanks @zero323 I already checked your answer in this question but wondering whether there's another option as I try to avoid internal APIs that may change in the future (or already changed in 2.0.) – Bruckwald Jun 24 '16 at 18:55
  • As far as I am aware nothing that will give you a human readable output. :/ And this is already private in 2.0.0-preview – zero323 Jun 24 '16 at 19:04
  • Thanks. I'll then try get rid of the enumeration in my case class I think this is the easiest solution as far as I see – Bruckwald Jun 25 '16 at 08:26

0 Answers0