
I have a DataFrame with the following schema:

|-- Data: struct (nullable = true)
|    |-- address_billing: struct (nullable = true)
|    |    |-- address1: string (nullable = true)
|    |    |-- address2: string (nullable = true)
|    |-- address_shipping: struct (nullable = true)
|    |    |-- address1: string (nullable = true)
|    |    |-- address2: string (nullable = true)
|    |    |-- city: string (nullable = true)
|    |-- cancelled_initiator: string (nullable = true)
|    |-- cancelled_reason: string (nullable = true)
|    |-- statuses: array (nullable = true)
|    |    |-- element: string (containsNull = true)
|    |-- store_code: string (nullable = true)
|    |-- store_name: string (nullable = true)
|    |-- tax_code: string (nullable = true)
|    |-- total: string (nullable = true)
|    |-- updated_at: string (nullable = true)

I need to extract all of its fields into separate columns without naming them manually.

Is there a way to do this? I tried:

val df2 = df1.select(df1.col("Data.*"))

but got the error

org.apache.spark.sql.AnalysisException: No such struct field * in address_billing, address_shipping,....

Also, can anyone suggest how to add a prefix to all of these columns, since some of the column names may be the same? The output should look like address_billing_address1, address_billing_address2, ...

Etisha
    Does this answer your question? [Exploding nested Struct in Spark dataframe](https://stackoverflow.com/questions/39275816/exploding-nested-struct-in-spark-dataframe) – mazaneicha Feb 12 '20 at 13:01
  • @mazaneicha `explode` is not required here. `explode` is only needed when the nested structure is an array. – Giri Feb 12 '20 at 16:16
  • @mazaneicha How to add the prefix to all these columns which we extracted out? – Etisha Feb 13 '20 at 06:24

1 Answer


Just change df1.col to col. Any of these should work:

df1.select(col("Data.*"))
df1.select($"Data.*")
df1.select("Data.*")
David Vrba