I am using Spark to read data from a Hive table, and what I really want is a strongly typed Dataset.
Here's what I am doing, and this works:
val myDF = spark.sql("select col1, col2 from hive_db.hive_table")
// Make sure that the field names in the case class exactly match the Hive column names
case class MyCaseClass(col1: String, col2: String)
import spark.implicits._  // supplies the Encoder[MyCaseClass] that .as needs
val myDS = myDF.as[MyCaseClass]
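The reason I want the typed Dataset is the compile-time-checked field access it gives me, for example:

myDS.map(row => row.col1.toUpperCase)  // field names checked by the compiler, unlike myDF("col1")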
The problem I have is that my Hive table is very wide and many of the columns are structs, so it's not trivial to define the case class by hand.
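For example, a single struct column forces a matching nested case class (the column and type names here are hypothetical, just to illustrate the shape):

// Hypothetical Hive column: address struct<street: string, city: string>
case class Address(street: String, city: String)
case class MyRow(col1: String, col2: String, address: Address)

and this has to be repeated for every struct column in the table.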
Is there a way to create a Dataset without having to write the case class myself? Since Hive already has all the column names and data types defined, is there a way to create the Dataset directly from that schema?
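For reference, Spark clearly has the full schema available at runtime, which is why I suspect this should be possible:

// Spark already knows the complete Hive schema, including struct columns
val df = spark.table("hive_db.hive_table")
df.printSchema()
df.schema.fields.foreach(f => println(s"${f.name}: ${f.dataType}"))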