Is there a way to flatten a struct object in PySpark?

root
 |-- key: struct (nullable = true)
 |    |-- id: string (nullable = true)
 |    |-- type: string (nullable = true)
 |    |-- date: string (nullable = true)

I found this SO post, "How to flatten a struct in a Spark dataframe?", to be similar, but I didn't know how to translate its answers from Spark to PySpark.

SOLUTION: For others, here is the full code solution that I was looking for:

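from pyspark.sql.functions import col

# Select each nested field; the resulting columns are named id, type, and date.
df.select(col("key.id"),
          col("key.type"),
          col("key.date"))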
user2205916
    The link you mentioned is useful for this. Read the article completely, and let me know if you still need help. – Manu Gupta Jan 22 '20 at 04:16

1 Answer

The column-path syntax is the same for Spark in Scala, Java, and Python. To access a field of a struct, use the dot (.) operator, as below.

df.select(col("key.id"))

The line above selects only the id field.
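
To avoid listing every field by hand, the same dot paths can be built from the DataFrame's schema. The following is a minimal sketch, not part of the original answer: flattened_paths and flatten_struct are hypothetical helper names, and it assumes that the other top-level column names contain no dots and do not collide with the struct's field names.

```python
def flattened_paths(struct_name, field_names):
    """Build dotted paths such as "key.id" for each field of a struct column."""
    return [f"{struct_name}.{field}" for field in field_names]


def flatten_struct(df, struct_name):
    """Return df with one struct column replaced by its fields as top-level columns.

    Hypothetical helper: assumes struct_name is a top-level struct column of df.
    """
    # Imported here so that flattened_paths can be used without Spark installed.
    from pyspark.sql.functions import col

    # The struct's dataType is a StructType; fieldNames() lists its field names.
    field_names = df.schema[struct_name].dataType.fieldNames()
    other_columns = [col(c) for c in df.columns if c != struct_name]
    nested_columns = [col(p) for p in flattened_paths(struct_name, field_names)]
    return df.select(*other_columns, *nested_columns)
```

With the schema from the question, flatten_struct(df, "key") should select key.id, key.type, and key.date. If only the struct's own fields are needed, df.select("key.*") expands them in a single call.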

Manoj Kumar Dhakad