I have a DataFrame that looks like this:
col1|col2|col3|col4
xxxx|yyyy|zzzz|[, 111, por-BR, 2222]
I want my output to look like this:
+----+----+----+-----+
|col1|col2|col3|col4 |
+----+----+----+-----+
| xx| yy| zz| 1111|
| xx| yy| zz| 2222|
+----+----+----+-----+
col4 is an array, and I want each of its elements to appear on its own row, all in a single column (either col4 itself or a new column).
Here is my actual schema:
data1:pyspark.sql.dataframe.DataFrame
  col1:string
  col2:string
  col3:string
  col4:array
    element:struct
      colDept:string
So far I have managed the following, which puts the array elements into separate columns rather than separate rows:
df = df.withColumn("col5", df["col4"].getItem(1)).withColumn("col4", df["col4"].getItem(0))
df.show()
+----+----+----+----+----+
|col1|col2|col3|col4|col5|
+----+----+----+----+----+
| xx| yy| zz|1111|2222|
+----+----+----+----+----+
But I want the output to look like this instead. Can anyone help, please?
#+----+----+----+-----+
#|col1|col2|col3|col4 |
#+----+----+----+-----+
#| xx| yy| zz| 1111|
#| xx| yy| zz| 2222|
#+----+----+----+-----+
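To make the desired transformation concrete, here is a minimal plain-Python sketch of the "one row per array element" behaviour, using a list of dicts in place of a DataFrame. The helper name `explode_rows` and the sample values are hypothetical and only for illustration; this is not Spark code, just the semantics being asked for.

```python
from typing import Any, Dict, Iterator, List


def explode_rows(rows: List[Dict[str, Any]], array_col: str) -> Iterator[Dict[str, Any]]:
    """Yield one output row per element of row[array_col].

    Hypothetical helper: the other columns are copied unchanged, and rows
    whose array is None or empty produce no output rows.
    """
    for row in rows:
        for element in row[array_col] or []:
            new_row = dict(row)           # copy col1..col3 as they are
            new_row[array_col] = element  # replace the array with one element
            yield new_row


# Sample input mirroring the question: one row whose col4 holds two values.
rows = [{"col1": "xx", "col2": "yy", "col3": "zz", "col4": ["1111", "2222"]}]

exploded = list(explode_rows(rows, "col4"))
for r in exploded:
    print(r)
```

In PySpark itself, the built-in that appears to match this is `explode` from `pyspark.sql.functions`, for example `df.withColumn("col4", explode("col4"))`. Since the schema shows each element as a struct with a `colDept` field, selecting `col4.colDept` after exploding would presumably be needed to get the plain string value. I have not run that against the actual data, so treat it as a starting point.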