Been struggling with this for a while and still can't make my mind around it.
I'm trying to flatMap (or use .withColumn
with explode()
instead as it seems easier so I don't lose column names), but I'm always getting the error UDTF expected 2 aliases but got 'name' instead
.
I've revisited some similar questions but none of them shed some light as their schemas are too simple.
The column of the schema I'm trying to perform flatMap with is the following one...
StructField(CarMake,
StructType(
List(
StructField(
Models,
MapType(
StringType,
StructType(
List(
StructField(Variant, StringType),
StructField(GasOrPetrol, StringType)
)
)
)
)
)
))
What I'm trying to achieve by calling explode() like this...
carsDS
.withColumn("modelsAndVariant", explode($"carmake.models"))
...is to achieve a Row without that nested Map and Struct so I get as many rows as variants are.
Example input
(country: Sweden, carMake: Volvo, carMake.Models: {"850": ("T5", "petrol"), "V50": ("T5", "petrol")})
Example output
(country: Sweden, carMake: Volvo, Model: "850", Variant: "T5", GasOrPetrol: "petrol"}
(country: Sweden, carMake: Volvo, Model: "V50", Variant: "T5", GasOrPetrol: "petrol"}
Basically leaving the nested Map with its inner Struct all in the same level.