
I define the class as follows:

import scala.beans.BeanProperty

class AbnormalSim2(Cust_Id: String, Trx_Dt: String, Cur_Bns_Bal: String) {
  @BeanProperty var Cust_Id1: String = _
  @BeanProperty var Trx_Dt1: String = _
  @BeanProperty var Cur_Bns_Bal1: BigDecimal = _
}

Then I try:

import spark.implicits._
implicit val mapEncoder = org.apache.spark.sql.Encoders.kryo[AbnormalSim2]
abnormal.select("Cust_Id","Trx_Dt","Cur_Bns_Bal").as[AbnormalSim2].show

which fails with the following error:

org.apache.spark.sql.AnalysisException: Try to map struct to Tuple1, but failed as the number of fields does not line up.
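A quick check (plain Scala, no Spark required, assuming Spark's encoder rules) of why the plain class fails: Spark's reflective `ExpressionEncoder` only derives a multi-column schema for `Product` types (case classes, tuples). A plain class with `@BeanProperty` vars is not a `Product`, and `Encoders.kryo` instead serializes the whole object into a single binary column, so the three selected columns cannot "line up" with that one field.

```scala
import scala.beans.BeanProperty

// Same shape as the class in the question: vars, but no Product instance.
class AbnormalSim2 {
  @BeanProperty var Cust_Id1: String = _
}

val o = new AbnormalSim2
// Not a Product, so no field-by-field encoder can be derived for it.
println(o.isInstanceOf[Product]) // prints false
```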

Then I define the class as a case class instead:

import scala.beans.BeanProperty

case class AbnormalSim() {
  @BeanProperty var Cust_Id: String = _
  @BeanProperty var Trx_Dt: String = _
  @BeanProperty var Cur_Bns_Bal: BigDecimal = _
}

import spark.implicits._
abnormal.select("Cust_Id","Trx_Dt","Cur_Bns_Bal").as[AbnormalSim].show

and it succeeds.
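A minimal sketch (plain Scala, no Spark needed) of why the case class works: case classes implement `Product`, and Spark's implicit product encoder reads the constructor fields by name and type. The idiomatic form puts the columns in the constructor instead of `@BeanProperty` vars:

```scala
// Hypothetical idiomatic variant of the question's class: columns as
// constructor parameters, matched to DataFrame columns by name.
case class AbnormalSim(Cust_Id: String, Trx_Dt: String, Cur_Bns_Bal: BigDecimal)

val row = AbnormalSim("C001", "2018-06-16", BigDecimal("12.34"))
// Case classes are Products: one element per column, introspectable by Spark.
println(row.productArity)      // prints 3
println(row.productElement(0)) // prints C001
```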

My question: when converting a DataFrame to a Dataset with `.as[T]`, must `T` be a case class?

  • Yes, `T` in `.as[T]` should be either a primitive type or a case class. Please refer to [this explanation](https://stackoverflow.com/a/39442829/631176) on Encoders; it's still the most detailed one on SO. – gemelen Jun 16 '18 at 11:16
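Besides case classes and primitives, tuples also work, because Spark derives an `ExpressionEncoder` for any `Product` type. A sketch (the Spark call is kept as a comment, assuming the `abnormal` DataFrame from the question; the runnable part is plain Scala):

```scala
// Hypothetical tuple-based alternative to a case class:
//   abnormal.select("Cust_Id", "Trx_Dt", "Cur_Bns_Bal")
//           .as[(String, String, java.math.BigDecimal)]
// Tuples qualify because, like case classes, they implement Product:
val t = ("C001", "2018-06-16", BigDecimal("12.34"))
println(t.isInstanceOf[Product]) // prints true
println(t.productArity)          // prints 3
```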
