
I have a DataFrame with the following schema:

|-- Data: struct (nullable = true)
|    |-- address_billing: struct (nullable = true)
|    |    |-- address1: string (nullable = true)
|    |    |-- address2: string (nullable = true)
|    |-- address_shipping: struct (nullable = true)
|    |    |-- address1: string (nullable = true)
|    |    |-- address2: string (nullable = true)
|    |    |-- city: string (nullable = true)
|    |-- cancelled_initiator: string (nullable = true)
|    |-- cancelled_reason: string (nullable = true)
|    |-- statuses: array (nullable = true)
|    |    |-- element: string (containsNull = true)
|    |-- store_code: string (nullable = true)
|    |-- store_name: string (nullable = true)
|    |-- tax_code: string (nullable = true)
|    |-- total: string (nullable = true)
|    |-- updated_at: string (nullable = true)

I need to extract all of its fields into separate columns without naming them manually.

Is there a way to do this? I tried:

val df2 = df1.select(df1.col("Data.*"))

but got the error

org.apache.spark.sql.AnalysisException: No such struct field * in address_billing, address_shipping,....

Also, can anyone suggest how to add a prefix to all of these columns, since some of the column names may be the same? The output should look like address_billing_address1, address_billing_address2, ...

Etisha
    Does this answer your question? [Exploding nested Struct in Spark dataframe](https://stackoverflow.com/questions/39275816/exploding-nested-struct-in-spark-dataframe) – mazaneicha Feb 12 '20 at 13:01
  • @mazaneicha `explode` is not required here. `explode` is only needed when the nested structure is an array. – Giri Feb 12 '20 at 16:16
  • @mazaneicha How to add the prefix to all these columns which we extracted out? – Etisha Feb 13 '20 at 06:24

1 Answer


Just change df1.col to col. Any of these should work:

df1.select(col("Data.*"))
df1.select($"Data.*")
df1.select("Data.*")
David Vrba