Given a schema like:
root
|-- first_name: string
|-- last_name: string
|-- degrees: array
| |-- element: struct
| | |-- school: string
| | |-- advisors: struct
| | | |-- advisor1: string
| | | |-- advisor2: string
How can I get a schema like:
root
|-- first_name: string
|-- last_name: string
|-- degrees: array
| |-- element: struct
| | |-- school: string
| | |-- advisor1: string
| | |-- advisor2: string
Currently, I explode the array, flatten the structure by selecting advisors.*, then group by first_name, last_name and rebuild the array with collect_list. I'm hoping there's a cleaner/shorter way to do this. The current approach also involves a lot of painful field renaming that I don't want to get into here. Thanks!
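To make the target shape concrete, the per-row reshaping can be sketched in plain Python (not Spark). This only illustrates the intended result on a single row represented as a dict shaped like the schema above. The helper name flatten_degrees and the sample values are made up for illustration, and the sketch assumes every degree carries an advisors field.

```python
from typing import Any, Dict


def flatten_degrees(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `row` where each degree's `advisors` struct is hoisted one level up."""
    flat_degrees = []
    for degree in row["degrees"]:
        # Keep every field of the degree except the nested struct...
        flat = {k: v for k, v in degree.items() if k != "advisors"}
        # ...then promote its fields (advisor1, advisor2) to the degree level.
        # A missing/None struct is treated as having no fields (an assumption).
        flat.update(degree.get("advisors") or {})
        flat_degrees.append(flat)
    # Top-level fields (first_name, last_name) are carried over unchanged.
    return {**row, "degrees": flat_degrees}


# Hypothetical sample row matching the first schema.
sample = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "degrees": [
        {"school": "School A", "advisors": {"advisor1": "X", "advisor2": "Y"}},
        {"school": "School B", "advisors": {"advisor1": "Z", "advisor2": None}},
    ],
}

result = flatten_degrees(sample)
```

The point of the sketch is that the array stays an array and the top-level columns are untouched. Only the element struct changes, which is why the explode / group by / collect_list round trip feels heavier than the transformation itself.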