I have a PySpark DataFrame in which some columns contain arrays of strings (and one column contains a nested array). Because the CSV writer does not support array columns, I cannot write the DataFrame to a CSV file.
Here is an example of the DataFrame I am dealing with:
+-------+---------------+-------+
|ID     |emailed        |clicked|
+-------+---------------+-------+
|9000316|[KBR, NRT, AOR]|[[AOR]]|
|9000854|[KBR, NRT, LAX]|Null   |
|9001996|[KBR, JFK]     |[[JFK]]|
+-------+---------------+-------+
I would like to convert it to the following structure, with the arrays turned into plain strings, so that it can be saved as a CSV file:
+-------+-------------+-------+
|ID     |emailed      |clicked|
+-------+-------------+-------+
|9000316|KBR, NRT, AOR|AOR    |
|9000854|KBR, NRT, LAX|Null   |
|9001996|KBR, JFK     |JFK    |
+-------+-------------+-------+
I am very new to pyspark. Your help is greatly appreciated. Thank you!