I'm using Spark Datasets to read in CSV files, and I wanted a polymorphic function that does this for a number of different files. Here's the function:
def loadFile[M](file: String): Dataset[M] = {
  import spark.implicits._
  val schema = Encoders.product[M].schema
  spark.read
    .option("header", "false")
    .schema(schema)
    .csv(file)
    .as[M]
}
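
For context, this is roughly how I intend to call it (Person is just a stand-in for one of my actual case classes):

// Hypothetical case class standing in for one of my real CSV layouts
case class Person(name: String, age: Int)

val people = loadFile[Person]("people.csv")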
The errors that I get are:
[error] <myfile>.scala:45: type arguments [M] do not conform to method product's type parameter bounds [T <: Product]
[error] val schema = Encoders.product[M].schema
[error] ^
[error] <myfile>.scala:50: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
[error] .as[M]
[error] ^
[error] two errors found
I don't know what to do about the first error. I tried adding the same upper bound that Encoders.product declares (M <: Product), but then I get the error "No TypeTag available for M".
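
That attempt was roughly this variant (a sketch, not working code):

def loadFile[M <: Product](file: String): Dataset[M] = {
  import spark.implicits._
  // gets past the Product bound check, but now fails with "No TypeTag available for M"
  val schema = Encoders.product[M].schema
  spark.read
    .option("header", "false")
    .schema(schema)
    .csv(file)
    .as[M]
}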
If I pass in the schema already produced from the encoder, I then get the error:
[error] Unable to find encoder for type stored in a Dataset
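
That version looked roughly like this, with the schema passed in as a parameter (again just a sketch of what I tried):

import org.apache.spark.sql.types.StructType

def loadFile[M](file: String, schema: StructType): Dataset[M] = {
  import spark.implicits._
  spark.read
    .option("header", "false")
    .schema(schema)
    .csv(file)
    .as[M]   // still fails: "Unable to find encoder for type stored in a Dataset"
}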