The following code creates a empty Dataset in Spark..
scala> val strings = spark.emptyDataset[String]
strings: org.apache.spark.sql.Dataset[String] = [value: string]
The signature of emptyDataset is..
@Experimental
@InterfaceStability.Evolving
def emptyDataset[T: Encoder]: Dataset[T] = {
val encoder = implicitly[Encoder[T]]
new Dataset(self, LocalRelation(encoder.schema.toAttributes), encoder)
}
Why the above signature is not forcing T to be subtype of Encoder?
It accepts T as of type String and creates a encoder for String and pass that to Dataset constructor.It finally creates Dataset[String].