I have a DataFrame df whose schema looks like this:
root
|-- users: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- id: string (nullable = true)
| | |-- ok: boolean (nullable = true)
| | |-- attributes: struct (nullable = true)
| | | |-- array1: array (nullable = true)
| | | | |-- element: string (containsNull = true)
| | | |-- groupid: string (nullable = true)
| | | |-- array2: array (nullable = true)
| | | | |-- element: string (containsNull = true)
| | | |-- array3: array (nullable = true)
| | | | |-- element: string (containsNull = true)
| | | |-- array4: array (nullable = true)
| | | | |-- element: string (containsNull = true)
I want to access and analyze the values of array1, array2, array3 and array4. This is what I am trying:
df.users.attributes.array1
It gives me this error:
AttributeError: 'Series' object has no attribute 'attributes'
How can I access the values within these arrays (array1, array2, array3 and array4)?
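For reference, here is a minimal sketch that reproduces the situation. It assumes df is a pandas DataFrame (the error mentions a 'Series') whose users column holds a list of dicts per row; the sample values are made up and only mirror the schema above.

```python
import pandas as pd

# Hypothetical sample row shaped like the schema above (values are made up).
df = pd.DataFrame({
    "users": [[
        {
            "id": "u1",
            "ok": True,
            "attributes": {
                "array1": ["a", "b"],
                "groupid": "g1",
                "array2": ["c"],
                "array3": [],
                "array4": ["d"],
            },
        },
    ]]
})

# Attribute chaining stops at the Series: df.users is a pandas Series,
# and pandas does not look inside the Python objects stored in its cells.
try:
    df.users.attributes.array1
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Indexing into a single cell by hand does reach the nested values:
# first row -> first user in the list -> attributes -> array1.
first_array1 = df["users"].iloc[0][0]["attributes"]["array1"]
```

So the nested values are reachable one cell at a time, but I am looking for a way to get at array1 through array4 for every user in every row.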