This is different from How to create Spark Dataset or Dataframe from case classes that contains Enums. In this question I want to know how to create a DataFrame, not a Dataset.
I have been trying to create a Spark DataFrame using case classes that contain Enums, but I'm not able to. I'm using Spark version 1.6.0. The exception complains that the schema for my Enum type is not supported. Is it not possible in Spark to have enums in the data and create a DataFrame from it?
Code:
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

object MyEnum extends Enumeration {
  type MyEnum = Value
  val Hello, World = Value
}

case class MyData(field: String, other: MyEnum.Value)

object EnumTest {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("test").setMaster("local[*]")
    val sc = new SparkContext(sparkConf)
    val sqlCtx = new SQLContext(sc)
    import sqlCtx.implicits._

    // The exception is thrown here: schema inference does not know how to map MyEnum.Value
    val df = sc.parallelize(Array(MyData("hello", MyEnum.World))).toDF()
    println(s"df: ${df.collect().mkString(",")}")
  }
}
Exception:
Exception in thread "main" java.lang.UnsupportedOperationException: Schema for type com.nordea.gpdw.dq.MyEnum.Value is not supported
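For reference, the workaround I'm currently considering is to store the enum as a plain String in the case class and convert back with MyEnum.withName when reading the result, since String is a schema type Spark's reflection-based inference does support. I'd prefer to keep the Enum in the case class itself, though. A minimal sketch of that idea (MyDataStr and EnumWorkaroundTest are names I made up for illustration):

import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

object MyEnum extends Enumeration {
  type MyEnum = Value
  val Hello, World = Value
}

// Variant of MyData that stores the enum as its String name instead of MyEnum.Value
case class MyDataStr(field: String, other: String)

object EnumWorkaroundTest {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("test").setMaster("local[*]")
    val sc = new SparkContext(sparkConf)
    val sqlCtx = new SQLContext(sc)
    import sqlCtx.implicits._

    // Convert the enum to a String when building the rows...
    val df = sc.parallelize(Seq(MyDataStr("hello", MyEnum.World.toString))).toDF()

    // ...and map it back to MyEnum.Value when collecting the result
    val restored = df.collect().map(row => MyEnum.withName(row.getAs[String]("other")))
    println(s"df: ${df.collect().mkString(",")}, restored: ${restored.mkString(",")}")

    sc.stop()
  }
}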