I'm new to Spark. I want to convert Spark DataFrame rows to case classes, but some of the fields are dynamic and their data types can't be decided beforehand, so I want to use generic case classes:
import scala.reflect.ClassTag
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.udf

case class Property[T: ClassTag](value: T)
case class PropertyList(d: Map[String, Property[_]])
case class RowHolder(id: Option[String] = None,
                     properties: Option[PropertyList] = None)

udf { row: Row => RowHolder(id = Option(row.getAs[String]("id"))) }
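For context, this is roughly how I intend to apply the udf (the DataFrame df and the struct columns are just placeholders for my real data):

// df is a hypothetical DataFrame with an "id" column
import org.apache.spark.sql.functions.{col, struct}

val toRowHolder = udf { row: Row => RowHolder(id = Option(row.getAs[String]("id"))) }
val withHolder = df.withColumn("holder", toRowHolder(struct(col("id"))))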
It seems that Spark has trouble recognizing the generic type and converting it to a schema, because I get this MatchError:
Exception in thread "main" scala.MatchError: Property[_] (of class scala.reflect.internal.Types$ExistentialType)
at org.apache.spark.sql.catalyst.ScalaReflection$class.getConstructorParameters(ScalaReflection.scala:838)
Is this related to Spark type erasure? Would custom serialization help? Any suggestions on how to resolve the issue?
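For example, I was wondering whether a Kryo-based encoder along these lines (just a sketch; df is again a placeholder DataFrame with an "id" column) would sidestep the reflection-based schema derivation:

import org.apache.spark.sql.{Dataset, Encoder, Encoders}

// sketch: encode RowHolder with Kryo so Spark stores it as a binary blob
// instead of deriving a struct schema via ScalaReflection
implicit val rowHolderEncoder: Encoder[RowHolder] = Encoders.kryo[RowHolder]

val holders: Dataset[RowHolder] =
  df.map(row => RowHolder(id = Option(row.getAs[String]("id"))))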
Thanks!