I have a function like this in my Scala code (Scala 2.13) for use with Spark:
import scala.reflect.runtime.universe.TypeTag

def getDataset[T <: Product: TypeTag](name: String): Dataset[T] = {
  import spark.implicits._
  // Read the parquet file for this name and register it as a temp view under the same name
  val ds = spark.read.parquet(BASE_PATH + "/" + name).as[T]
  ds.createOrReplaceTempView(name)
  ds
}
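Today every call repeats the class name as both the type parameter and the name string, which is the duplication I'd like to get rid of (CLASS1 and CLASS2 are two of my case classes, defined below):

// The class name appears twice in each call: once as a type, once as a string
val ds1 = getDataset[CLASS1](name = "CLASS1")
val ds2 = getDataset[CLASS2](name = "CLASS2")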
Now I want to take a Seq of case classes and, for each class, call this function:
case class CLASS1(...)
case class CLASS2(...)
case class CLASS3(...)
Seq(CLASS1, CLASS2, CLASS3, ....).foreach {
  c => getDataset[c??](name = c???)
}
I'm having a hard time figuring out the exact syntax; the symbol for the name of the case class, represented by the variable c inside the foreach, seems to represent the type of the apply method (() => Product). What I really want is the type of the case class to use as the type parameter, and the name of the case class.
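A minimal illustration of what I'm seeing (CLASS1 here is given a made-up field just for the example):

case class CLASS1(id: Long)        // made-up field, for illustration only

val c = CLASS1                     // c is the companion object; its type is CLASS1.type
// getDataset[c](name = "CLASS1")  // does not compile: c is a value, not a type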
It feels like I should be able to do this - what am I missing here?
Update: It looks like it's possible to get the name of the type used in a type parameter at runtime, via TypeTag.
The solution I am converging on is something like this:
import scala.reflect.runtime.universe.{TypeTag, typeTag}

def getDataset[T <: Product: TypeTag]: Dataset[T] = {
  import spark.implicits._
  // Derive the dataset/view name from the case class's simple type name at runtime
  val name = typeTag[T].tpe.typeSymbol.name.toString
  val ds = spark.read.parquet(BASE_PATH + "/" + name).as[T]
  ds.createOrReplaceTempView(name)
  ds
}
Then I can write something like Seq(getDataset[CLASS1], getDataset[CLASS2], ...).
Not what I hoped for, but at least I can stop copy-pasting the class name as both a type parameter and a string.
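Put together, the call site ends up looking something like this (CLASS1 and CLASS2 stand in for my real case classes, with made-up fields):

case class CLASS1(id: Long, value: String)     // made-up fields, for illustration
case class CLASS2(id: Long, amount: Double)    // made-up fields, for illustration

// Each call reads BASE_PATH/<ClassName> and registers a temp view of the same name
val datasets: Seq[Dataset[_]] = Seq(getDataset[CLASS1], getDataset[CLASS2])

// The temp views can then be queried by class name
spark.sql("SELECT COUNT(*) FROM CLASS1").show()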