Failing to convert a dataframe to a dataset of object with an enumeration custom field

Question

I am facing an issue when trying to convert a dataframe to a dataset of objects with a custom field.

In this code, I have a dataframe with two columns, country, and currency. I want to convert this into a dataset using the MyObj case class where the country is a string and currency is an enumeration.

Here is the code:

val schema = StructType(Seq(
  StructField("country", StringType),
  StructField("currency", StringType)
))

// Define the sample data
val data = Seq(
  ("France", "EUR"),
  ("USA", "DOLLAR"),
  ("Germany", "EUR")
)

// Create a DataFrame from the sample data
val df = sparkSession.createDataFrame(data).toDF(schema.fieldNames: _*)

class Currency extends Enumeration {
  type Currency = Value
  val EUR = Value("EUR")
  val DOLLAR = Value("DOLLAR")
}

case class MyObj(country: String, currency: Currency)

val dsProduct = df.as[MyObj](Encoders.product[MyObj])

Here is the error I face when executing the program:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Try to map struct<country:string,currency:string> to Tuple1, but failed as the number of fields does not line up.

If I change the currency type to a string, it works just fine, but I want to keep it as an enumeration for another use case.

Any idea how can I fix that?

score 1 · Answer 1 · answered Apr 25 '23 at 14:41

class Currency extends Enumeration { ...

should be

object Currency extends Enumeration { ...

and

case class MyObj(country: String, currency: Currency)

should be

case class MyObj(country: String, currency: Currency.Currency)

or

case class MyObj(country: String, currency: Currency.Value)

dsProduct.show()

//+-------+--------+
//|country|currency|
//+-------+--------+
//| France|     EUR|
//|    USA|  DOLLAR|
//|Germany|     EUR|
//+-------+--------+

How to create Spark Dataset or Dataframe from case classes that contains Enums

Failing to convert a dataframe to a dataset of object with an enumeration custom field

1 Answers1